ABSTRACT: This study uses data on 123 firms over an 11-year period to examine
whether the accounting for employee stock options permits them to be used as part
of an income management strategy. Using a pooled cross-sectional, time-series
analysis, the value of options granted is found to be negatively related to the extent
the firm is below its target level of income and positively related to the firm's use of
income-increasing accounting methods. I also find weak evidence of a positive
relation between the firm's relative use of income-increasing accounting methods
and the probability of issuing unattached stock options rather than income-decreasing
securities such as stock appreciation rights or tandem securities.
However, the results from both tests are sensitive to the estimation method and are
not consistent over time.


Key Words: Executive stock options, Executive compensation, Financial
reporting costs. 


Data Availability: Data used in this study are available from the author upon
request. 


I. INTRODUCTION 


The current financial reporting rules applicable to employee stock options (ESOs) allow
firms to compensate managers without any charge against earnings. This treatment has 
drawn the attention of the Financial Accounting Standards Board (FASB) as well as the 
popular press. For example, Thomas Stewart (1990, 94) examines the question of why options are
flourishing, and concludes: “A principal - and astonishing - reason is that stock options are the only
major form of compensation that never shows up as a cost on a company’s profit-and-loss 
statement.” Rubin (1988, 35) quotes Garry N. Teesdale, Vice-President, The Hay Group as stating 
“if you want a good solid illustration of how companies are driven by the technical issues, watch 
what happens to the technology companies if FASB puts through their proposed change in 
accounting for stock options. We calculated the annual impact on the bottom line at one major 
semiconductor company; if they continue their option grant pattern, their annual hit to earnings 
will be $40 million per year just as a result of the change in FASB rules.” Although there has been 
speculation that the (favorable) financial reporting treatment has encouraged the use of employee 
stock options, there has been little empirical evidence to support such claims. 

To date, academic research has concentrated on the incentive and tax effects of ESOs and 
largely ignored their income effects.¹ Yet, if the board of directors believes the firm benefits from
reporting higher income levels, then the favorable income effect of ESOs reduces their net cost. 
Therefore, I hypothesize that the greater the income benefit, the greater the relative use of ESOs. 
That is, firms use the value of options granted as part of an earnings management strategy. 

To test this hypothesis, I regress the value of options granted per employee against (1) the 
extent income is below a target level and (2) the extent the firm uses income-increasing 
accounting methods. The first variable measures a firm's propensity to use options as part of a 
short-term income strategy (i.e., substitute options for other forms of compensation to boost
income in a particular year). The second variable evaluates the use of options as part of a long- 
term income strategy. The regression also includes variables to control for tax, incentive, and 
liquidity factors. The results provide weak evidence of a positive relation between the value of 
ESOs issued by a firm and the firm's use of income-increasing accounting methods, and a 
negative relation between the value of ESOs and the extent to which income falls below a target 
level. However, the results are not consistent across research methods or over time. 

I also conduct a second test that controls for incentives by restricting the analysis to securities
with similar return distributions. This test examines the decision to form a tandem security by
attaching a stock appreciation right (SAR) to the option.² By exercising the SAR, the manager
saves transaction costs. However, if the manager chooses to exercise the SAR, the firm records
an expense for the appreciation in the price of the stock. The results generally support the
prediction that firms using more income-increasing accounting methods are more likely to grant
unattached stock options.

Taken together, the results provide weak evidence of the use of ESOs as part of an income 
management strategy. They also suggest that the proposed change in the financial reporting 
treatment of stock options is likely to reduce the use of ESOs for some firms. However, 
inconsistencies in the results across research methods and over time indicate the need for further 
research. In addition, the study is subject to several important limitations, which are discussed in 
the Conclusion section of the paper. 

The paper continues as follows: Section II outlines tax and financial reporting treatments for
employee stock options and stock appreciation rights. Section III includes a discussion of the 


1 Analytical papers include Haugen and Senbet (1981), Hagerty et al. (1990) and Lambert et al. (1991). Empirical papers
include Agrawal and Mandelker (1987), DeFusco et al. (1990), Hagerty et al. (1990), Lambert et al. (1989), and 
Matsunaga et al. (1992). 

2 An SAR enables the holder to receive the appreciation in the value of the stock over the exercise price in cash. It differs
from an option in that the manager does not have to buy the stock (exercise the option) and resell the stock in the open 
market. Relative to a stock option, an SAR will generally reduce reported income. 
TABLE 1
Tax and Financial Reporting Treatments for Selected Equity Securities


Panel A: Tax Treatment

                                Exercise              Sale of Stock
Nonqualified stock option:
    Individual                  t_o * (S_e - X)       t_g * (S_s - S_e)
    Corporation                 -t_c * (S_e - X)      No Effect
Incentive stock option:
    Individual                  No Effect             t_g * (S_s - X)
    Corporation                 No Effect             No Effect
Stock appreciation right:
    Individual                  t_o * (S_e - X)       Not Applicable
    Corporation                 -t_c * (S_e - X)      Not Applicable

Panel B: Financial Reporting Treatment

                                End of Year           Exercise
Stock option                    No Entryᵃ             Dr. Cash
                                                      Cr. Stockholders' Equity

Stock appreciation right        Dr. Expenseᵇ          Dr. Liability
                                Cr. Liability         Cr. Cash
                                [(S_t - X) * VP]

S_t = Stock price at the end of year t.
X   = Exercise price.
S_e = Stock price on date of exercise.
S_s = Stock price on date of sale.
t_o = Individual ordinary income tax rate.
t_g = Individual long-term capital gains tax rate.
t_c = Corporate income tax rate.
VP  = Adjustment for vesting period: the expense is allocated to individual years
      over the vesting period.


a This assumes that the exercise price is set equal to the fair market value of the underlying stock on the measurement
date.

b The liability is recomputed each year based upon the fair market value of the underlying stock as of the balance sheet
date and the extent to which the SAR is vested. The resulting change in the liability is offset by an entry to compensation
expense.
research design and results of the tests on the value of options granted, while Section IV
focuses on a discussion of the research design and results of the tests of the choice of equity security.
Finally, Section V summarizes the conclusions and includes suggestions for further research.


II. TAX AND FINANCIAL REPORTING TREATMENTS FOR EMPLOYEE STOCK
OPTIONS AND STOCK APPRECIATION RIGHTS


Tax Treatment 


Panel A of table 1 summarizes the tax treatments of stock options and stock appreciation
rights. The tax code classifies stock options as Incentive Stock Options (ISOs) or Nonqualified
Stock Options (NSOs).³

The Internal Revenue Code (section 422) defines Incentive Stock Options (ISOs) as options
issued to employees under an Incentive Stock Option Plan that meets the following conditions:


A. The life of the option may not exceed ten years.

B. The option price must be no less than the fair market value of the stock on the date of
issuance.

C. The option must be transferable by inheritance only.

D. The option plan must specify the aggregate number of shares that may be issued and the
employees eligible to receive the option.

E. The option must be granted within ten years of the earlier of the date of the adoption of the
plan or the date it was approved by the shareholders.

F. If the employee owns more than ten percent of the company, the option price must be at
least 110 percent of the market value and its life may not exceed five years.

G. The employee must hold the stock for a minimum of two years after the option is granted
and for one year after the option is exercised.


These conditions are generally not restrictive, and most NSOs also meet these criteria. The
primary difference between NSOs and ISOs is the tax treatment selected. In addition, financial
reporting rules do not distinguish between the two types of employee stock options.

ISOs confer two tax benefits to the individual relative to NSOs and SARs. First, the individual
applies the capital gains rate, rather than the ordinary income rate, on the gain from the exercise
of an ISO, and second, the individual pays the tax in the year the underlying stock is sold rather
than the year of exercise.⁴ The latter reduces the present value of the tax liability and allows
individuals to use the proceeds from the sale of the stock to pay the tax. On the other hand, while
the firm receives a deduction for the manager’s gain on the exercise of an NSO or an SAR, the
firm is not allowed to deduct the gain on the exercise of an ISO.
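
The per-share tax effects summarized in Panel A of table 1 can be illustrated with a short sketch. The function names and the example prices and tax rates below are hypothetical, chosen only for illustration, and the sketch ignores the present-value benefit of deferring the individual's tax to the year of sale.

```python
def nso_or_sar_tax_at_exercise(s_e, x, t_o, t_c):
    """Per-share tax effects when an NSO or SAR is exercised.

    s_e: stock price on the date of exercise, x: exercise price,
    t_o: individual ordinary income tax rate, t_c: corporate tax rate.
    Returns (individual_tax, corporate_tax); a negative corporate
    figure is a tax saving from the deduction.
    """
    gain = s_e - x
    return t_o * gain, -t_c * gain


def nso_tax_at_sale(s_s, s_e, t_g):
    """Individual capital gains tax when stock acquired via an NSO is sold."""
    return t_g * (s_s - s_e)


def iso_individual_tax_at_sale(s_s, x, t_g):
    """ISO: no tax effect at exercise; the individual pays capital gains
    tax on the full gain over the exercise price when the stock is sold.
    The corporation has no tax effect at either date."""
    return t_g * (s_s - x)


# Hypothetical example: exercise price 20, exercised at 30, sold at 36.
individual, corporate = nso_or_sar_tax_at_exercise(30.0, 20.0, 0.50, 0.46)
print("NSO at exercise:", individual, corporate)
print("NSO at sale:", nso_tax_at_sale(36.0, 30.0, 0.20))
print("ISO at sale:", iso_individual_tax_at_sale(36.0, 20.0, 0.20))
```

Under these assumed rates the individual pays less in total with the ISO, while only the NSO (or SAR) produces a corporate deduction, which is the trade-off described above.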


Financial Reporting Treatment 


Panel B of table 1 summarizes the financial reporting treatments of stock options and stock
appreciation rights. The difference between the exercise price of the option and the fair market
value of the firm’s stock on the measurement date determines the compensation expense for stock
options.⁵ If the exercise price is greater than or equal to the fair market value of the underlying


3 Since the tax treatments of NSOs and SARs are virtually identical, the two securities are discussed together.

4 From 1979 - 1986, individuals were allowed to exclude 60 percent of long-term capital gains from taxable income. In
1987 the capital gains exclusion was repealed.

5 APB #25 defines the measurement date as the first date when both the exercise price and the number of shares
have been determined, which is often the date of grant.
stock on the measurement date, the firm does not recognize any compensation expense over the
life of that option.⁶ In this case, the firm only records an entry when the manager exercises the
ESO (cash and stockholders' equity are increased by the exercise price).

The firm records a liability for stock appreciation rights at the end of each year to reflect the 
appreciation in price from the date of grant (adjusted for a vesting period). The change in the 
liability increases or decreases reported income for the firm. As a result, there is generally an 
inverse relation between the effect on earnings and annual price appreciation. 
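
The year-end accrual for an SAR described above can be sketched as follows. This is an illustration only: the function names are hypothetical, vesting is assumed to be straight-line over the vesting period, and the liability is assumed to be floored at zero when the stock price is below the exercise price.

```python
def sar_liability(price, exercise_price, vested_fraction):
    """Year-end liability per SAR: appreciation over the exercise price
    times the vested fraction (zero floor is an assumption of this sketch)."""
    return max(price - exercise_price, 0.0) * vested_fraction


def sar_expense_by_year(year_end_prices, exercise_price, vesting_years):
    """Compensation expense per SAR for each year, computed as the change
    in the recomputed liability. Assumes straight-line vesting."""
    expenses = []
    prior_liability = 0.0
    for year, price in enumerate(year_end_prices, start=1):
        vested_fraction = min(year / vesting_years, 1.0)
        liability = sar_liability(price, exercise_price, vested_fraction)
        expenses.append(liability - prior_liability)
        prior_liability = liability
    return expenses


# Hypothetical example: exercise price 20, two-year vesting period.
print(sar_expense_by_year([24.0, 30.0, 27.0], 20.0, 2))
```

In this example the expense is positive while the price rises and negative in the third year when the price falls, which is the inverse relation between earnings and price appreciation noted in the text.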

Some firms grant SARs in tandem with either NSOs or ISOs. The tandem securities enable 
the holder to choose which security to exercise. Therefore, a manager holding a tandem security 
with an ISO can choose to exercise either the SAR or the ISO based upon the prevailing tax rates. 
According to FASB Interpretation #28, a firm should account for a tandem security based upon
which security the firm expects the manager to select, with a presumption that the manager will
exercise the SAR.⁷ Therefore, attaching an SAR to an ESO increases expected compensation
expense. 

If a firm receives a benefit (increase in firm value) from reporting higher levels of income, 
then the ability to avoid the recognition of compensation expense reduces the effective cost of 
issuing ESOs. These benefits (summarized in Watts and Zimmerman 1986, 210-243) generally 
relate to explicit and implicit contracts that rely upon reported income figures. 

An increase in the proportion of stock options in the contract (which would boost reported 
income) is costly to the firm if it results in a shift from the optimal incentive and risk-sharing 
contract.⁸ Thus, the firm faces a trade-off of the financial reporting advantage of stock options
against the additional incentive and risk-sharing costs incurred. 

Past research has generated evidence of firm decisions being influenced by the effect on 
reported income. For example, Hand (1989) and Hand et al. (1990) suggest that firms undertake 
financial transactions in an attempt to boost income to reach a time-series trend. Matsunaga et al. 
(1992) present evidence that the transaction's effect on income influences a firm's decision to 
induce managers to disqualify their incentive stock options.⁹

The financial reporting rules applicable to ESOs provide firms with another means of 
boosting reported income. In particular, if a firm expects reported income to be low, it could 
reduce reported compensation expense by substituting ESOs for other forms of compensation. 
This leads to the following hypothesis (stated in alternative form): 


Hypothesis 1: Firms adjust the value of employee stock option grants as part of a short-term 
income management strategy. 


Adjusting the value of ESO grants is not the only method of managing income. Studies such 
as DeAngelo (1988), McNichols and Wilson (1988), and Jones (1991) present evidence that firms 
manage accruals as part of a short-term income strategy. If a firm is able to achieve its desired 


6 The vast majority of stock options are issued with the exercise price set equal to the fair market value on the date of grant.
Of the 135 firms in the preliminary sample identified in this study only seven (five percent) issued discount options, i.e., 
options with an exercise price below the fair market value of the underlying security. 

7 By choosing to exercise the SAR, the manager saves the transaction costs associated with the exercise of the stock option
and the sale of the stock received from the exercise. In addition, during the test period, SEC restrictions on insider trading
required stock received from the exercise of a stock option to be held for a six-month period. The manager can avoid the
firm-specific risk of holding the stock by choosing the SAR.

8 This holds unless the returns on the options are hedged by the granting firm (see Hemmer 1993).

9 Under certain circumstances a firm could benefit from paying managers to disqualify ISOs. Managers can disqualify
an ISO by selling the underlying stock less than a year after the option was exercised. The sale converts the ISO into
an NSO for tax purposes, whereby the firm receives a tax deduction for the manager’s gain on the exercise of the option.
However, the amount paid to the manager to compensate for the lost individual tax benefits of ISOs reduces reported
income.
income level using these other methods (and can do so at a lower cost), then the financial reporting
benefit from adjusting its stock option grants is reduced. 

The use of employee stock options could also reflect a long-term income strategy (short-term 
and long-term income management strategies are not mutually exclusive). As noted by Zmijewski 
and Hagerman (1981), firms place differential values on reporting higher levels of income. 
Therefore, the value a firm places on reporting higher levels of earnings could influence the firm’s 
use of ESOs. This leads to the following hypothesis (stated in alternative form): 


Hypothesis 2: Firms adjust the value of employee stock option grants as part of a long-term
income management strategy. 


III. TESTS OF THE VALUE OF STOCK OPTION GRANTS
Sample 


The test period for this study includes the 11 years from 1979-1989, inclusive. Table 2 shows 
the derivation of the final sample of 123 firms, which are listed in the appendix. To be included 
in the sample, firms must (a) come from a non-regulated industry, i.e., excluding transportation, 
utility, and financial service industries, (b) be listed on either the NYSE or AMEX during the 


TABLE 2
Sample Firm Selection

Total # of Firms on Compustat (1991)                              6,970
Firms in 4000 - 4999 or 6000 - 6999 SIC codesᵃ                   (1,624)
                                                                  5,346
Firms not listed on AMEX or NYSE                                 (3,769)
                                                                  1,577
Firms with non-December fiscal year-end                            (444)
                                                                  1,133
Firms without Compustat data for the required periodᵇ              (398)
                                                                    535
Firms without R&D expense data for the required period             (297)
                                                                    238
Firms without unamortized past service costᶜ                        (87)
                                                                    151
Firms without CRSP data                                             (17)
Preliminary Sample                                                  134
Other Itemsᵈ                                                        (11)
Final Sample                                                        123


a The 4000 SIC code consists of transportation and utility firms and the 6000 SIC code contains financial institutions,
brokers, insurance and real estate firms. These are considered to be regulated industries.

b The required data period is the 1979 through 1989 test period plus a 1978 estimation period.

c This item, D90, was required for one year in the test period to identify firms that would disclose an
amortization period of past service costs.

d Other items (# of firms) include foreign firms (4) and firms that issued discount options (7).
entire sample period, (c) have a December fiscal year-end, (d) have data on Compustat for the
entire test period, (e) have research and development expenditures (d46) and unfunded past or 
prior service cost (d90) on Compustat, and (f) have return data on CRSP for the entire sample 
period and an additional one year estimation period. The exclusion of regulated firms eliminates 
incentive differences resulting from the regulatory environment. Proxy statements for OTC firms 
were not available at the University of Washington library at the time the study was conducted. 
The year-end restriction ensures comparability regarding the timing of tax changes. The 
restriction for research and development expenditures increases the probability that sample firms 
will use stock options in their compensation contracts.¹⁰ The restriction on past or prior service
costs limits the sample to firms with an equal number of observed accounting choices. I also 
eliminate 11 firms because they are foreign firms or issued discount options. Thirty-six firm-year 
observations are lost because disclosures are not sufficient to estimate the value of the ESO grant, 
leaving a total of 1,317 firm-year observations. 

As expected, given the survivorship bias and data restrictions, the sample firms tend to be 
larger and more profitable than the average Compustat firm. The median ROA and firm value 
(market value of equity plus book value of debt) for the final sample firms were 7.3% and $1,489
million, respectively. In contrast, the median ROA and firm value for all Compustat firms over the
test period were 3.9% and $66 million, respectively. Despite the R&D requirement, the median
market to book ratio for the sample firms (1.42) is only slightly higher than the median ratio for 
all Compustat firms (1.37). To the extent that larger and more successful firms have lower 
financial reporting costs, the bias in the sample composition reduces the power of the tests. 


Research Design


The hypotheses predict a positive relation between a firm's use of employee stock options 
and the firm's financial reporting benefits. The tests consist of regressions of the value of ESOs 
granted on test variables that reflect the firm's financial reporting benefits and control variables 
for taxes, incentives and cash flow. The variables and the formulas used to calculate their values 
are summarized in table 3. 


Measurement of the Value of Stock Options 


Following the procedure used by Antle and Smith (1986) and Hagerty et al. (1990), stock 
options are valued using the dividend-adjusted Black-Scholes model. 


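\[
BS(P, r, \sigma^2, T, \delta) = P e^{-\delta T} N(d_1) - X e^{-rT} N(d_2) \tag{1}
\]

\[
d_1 = \frac{\ln(P/X) + \left(r - \delta + \sigma^2/2\right) T}{\sigma \sqrt{T}} \tag{1a}
\]

\[
d_2 = d_1 - \sigma \sqrt{T} \tag{1b}
\]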
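
As a rough illustration of the valuation in expression (1), the following sketch computes the dividend-adjusted Black-Scholes value of a single option. It assumes the standard dividend-adjusted (Merton) form of the model, with d1 = [ln(P/X) + (r - delta + sigma^2/2)T] / (sigma * sqrt(T)) and d2 = d1 - sigma * sqrt(T); the function names and the example inputs are hypothetical and are not taken from the sample data.

```python
from math import erf, exp, log, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function N(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def black_scholes_value(P, X, r, sigma2, T, delta):
    """Dividend-adjusted Black-Scholes value of one call option.

    P: stock price at grant, X: exercise price, r: riskless rate,
    sigma2: annualized return variance, T: years to maturity,
    delta: annualized dividend yield.
    """
    sigma = sqrt(sigma2)
    d1 = (log(P / X) + (r - delta + sigma2 / 2.0) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return P * exp(-delta * T) * norm_cdf(d1) - X * exp(-r * T) * norm_cdf(d2)


# Hypothetical at-the-money grant (P = X) with a ten-year life.
value = black_scholes_value(P=50.0, X=50.0, r=0.08, sigma2=0.09, T=10.0, delta=0.03)
print("Option value:", round(value, 2))
print("Value as a fraction of exercise price:", round(value / 50.0, 3))
```

When P = X, dividing the value by the exercise price gives the bracketed multiplier discussed in footnote 12.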


10 Evidence from Clinch (1991) indicates that the probability a firm will use ESOs is positively associated with the firm's
R&D expenditures. Therefore, this screen was used to limit the number of firms (i.e., limit the costs of hand collecting
data) to those firms that are most likely to grant options during the sample period. This restriction increases the power
of tests that examine changes in option grants over time.

11 As discussed in Lambert et al. (1991), the Black-Scholes model is likely to overstate the value of non-tradable options
granted to risk-averse managers. The extent of the overstatement is determined by the manager's risk aversion and the 
other components of the manager's investment portfolio. To the extent that these characteristics vary across managers 
(and thus across firms) or over time, the use of the Black-Scholes model is likely to induce noise in the estimation, i.e., 
to the extent this measurement error is uncorrelated with the right-hand side variables, it is a noise (power) problem, 
not a bias problem. The sensitivity of the tests to the valuation of ESOs is explored in the discussion of the results. 

12 If P = X, after factoring out P, expression (1) becomes (continued below)
TABLE 3
Listing and Definition of Variables


Dependent Variables
V                             Black-Scholes value of stock options granted per employee in 1979 dollars
Security                      1 if the firm issued unattached stock options
                              0 if the firm issued tandem securities or SARs

Test Variables
NIA                           (Net Income - SO - Target) / Total Assets if Net Income - SO > Target and 0 otherwise
                              Target = NI(t-1) + [NI(t-1) - NI(t-6)] / 5 if NI(t-1) > NI(t-6) and NI(t-1) otherwise
NIB                           (Net Income - SO - Target) / Total Assets if Net Income - SO < Target and 0 otherwise
ACC                           Proportion of income increasing accounting methods used by the firm
ACC2                          Proportion of income increasing accounting methods (excluding inventory method)

Control Variables
DYear                         Dummy variables for each sample year (excluding 1979)
DSIC                          Dummy variable for each of the four most common 2 digit SIC codes
D81TX                         The estimated marginal tax rate if year is 1981 or later and zero otherwise
Market to Book Ratio (MKBK)   (Market value of equity + Book value of debt + Book value of preferred stock)
                              / Beginning total assets
R&D expense (RD)              Research and Development Expenditures / Beginning Total Assets
Firm specific risk (RSK)      Variance of Market Model Residuals / Total Variance
Firm size (SIZ)               1 / Beginning Total Assets
Leverage (LEV)                Beginning Book Value of Debt / Beginning Total Assets
Liquidity (LIQ)               (Beginning Current Assets - Current Liabilities) / Beginning Total Assets
Insider ownership (INSD)      Percentage of the firm held by insiders
BSV                           Black-Scholes value of stock options (or stock appreciation rights)
where,

P   is the fair market value of the underlying stock on the date of grant.

X   is the exercise price on the option. Note that for all observations in the sample P = X.¹³

r   is the riskless rate of interest, represented by the average 90 day T-bill rate for the year.

T   is the time to maturity.¹⁴

σ²  is the annualized variance of daily returns on the stock for the year in which the options
    were issued.

δ   is the annualized dividend yield on the stock estimated using the last dividend declared
    during the fiscal year and the year-end price.


The dependent variable includes all options granted by the firm in a given year expressed
in constant (1979) dollars using the inflation rates reported by Ibbotson Associates (1992).
Ideally, I would deflate the value of options by the other compensation components to capture the
substitution of stock options for cash. However, that information is not available on a firm-wide
basis, so the value of options issued is deflated by the number of employees (Compustat item
d29).¹⁵
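
The scaling of the dependent variable can be sketched as below. This is a hedged illustration: the function name, the dictionary of inflation rates, and the compounding convention (the price level in a given year equals the 1979 level compounded by each later year's inflation rate) are assumptions of the sketch, and the rates shown are made up rather than the Ibbotson figures.

```python
def value_per_employee_1979_dollars(nominal_value, grant_year, inflation,
                                    employees, base_year=1979):
    """Deflate the nominal value of options granted to base-year dollars
    and scale by the number of employees.

    inflation maps each year after base_year to that year's inflation
    rate (hypothetical input for this sketch).
    """
    deflator = 1.0
    for year in range(base_year + 1, grant_year + 1):
        deflator *= 1.0 + inflation[year]
    return nominal_value / deflator / employees


# Hypothetical example: $1.21 million of options granted in 1981,
# 10 percent inflation in 1980 and 1981, and 1,000 employees.
rates = {1980: 0.10, 1981: 0.10}
print(value_per_employee_1979_dollars(1_210_000.0, 1981, rates, 1000))
```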

Panel A of table 4 presents descriptive statistics of the distribution of the Black-Scholes value 
of employee stock options per employee (in 1979 dollars) granted by the sample firms. The mean 
(median) value of stock options granted for the pooled sample is $136 ($68). The growth in the 
value of option grants after 1981 can be attributed to the passage of the 1981 Economic Recovery 
Tax Act that created ISOs. On the other hand, the jump in 1987 is likely due to new issues of stock 
options after the stock market crash left many old issues “out of the money.” 


Financial Reporting Benefits 


Hypothesis 1 suggests that firms use ESOs as a tool to manage short-term reported income. 
As in Matsunaga et al. (1992), I assume costs arise from reporting net income below a target level 
because of various agreements which either implicitly or explicitly rely on reported income. 
Target income is assumed to follow a random walk with drift if the estimated drift is positive, and
a random walk without drift if the estimated drift is negative.¹⁶ I estimate the drift as the average
change in income before extraordinary items over the prior five years.¹⁷ To estimate income


Footnote 12 (continued)
\[ P \left[ e^{-\delta T} N(d_1) - e^{-rT} N(d_2) \right] \]
The term in brackets can be thought of as a multiplier of the exercise price. In other words, one can estimate the Black-
Scholes value of an option by multiplying the exercise price by the term in brackets.
For the final sample, on average the Black-Scholes value of an option is approximately 40 percent of the exercise price.
The distribution appears to be fairly tight, i.e., half of the observations fall between 30 percent and 50 percent, and fairly
stable over time. Thus, although the Black-Scholes value can be a complex calculation, as a rule of thumb, the value
(given the estimation methods and assumptions listed earlier) tends to be between 30 percent and 50 percent of the exercise
price.

13 Because the financial reporting and tax effects differ for options issued at a discount, i.e., where the exercise price is
below the fair market value on the date of grant, the seven firms that met the sample criteria but issued discounted options 
were excluded from the sample. 

14 Most option plans state only that the options can have a life no greater than ten years, with the actual maturity left to
the discretion of the compensation committee of the board. T is assumed to be equal to ten, unless a different life, e.g. 
five years, is explicitly stated. 

15 The disclosure of the exercise prices for the options differs over time for certain firms as well as across firms. If the firm 
disclosed the exercise price in the annual report or the proxy statement, I use the actual price. If firms only disclose a 
range of prices, I use the midpoint of the range. 

16 In the sample, there were 364 observations in which the drift term was negative.

17 The tests are also conducted with target income defined as a random walk with drift (regardless of sign). In addition,
tests are run using a dummy variable equal to one if the firm is below target. The qualitative results of the tests using
these alternative specifications are similar to the ones reported.
TABLE 4
Descriptive Statistics on the Distribution of the Value of Option Grants


Panel A: Value of options granted per employee (in 1979 dollars)

Year      Mean    1st Quartile   Median   3rd Quartile   90th %
Pooled    136.3        6.2         67.6       167.4       333.8
1979       72.0        0.0         21.3        74.0       208.3
1980      110.1        0.0         33.9       118.2       287.2
1981       93.3        0.0         31.9        91.2       250.5
1982      101.9        1.8         61.7       125.2       293.1
1983      135.6       10.2         66.5       148.6       286.6
1984      127.6       17.9         68.7       152.3       308.2
1985      131.5       13.8         67.6       140.2       319.3
1986      152.4       24.5         88.1       208.7       390.2
1987      220.7       67.5        140.3       296.9       497.7
1988      161.7       38.7         87.5       225.6       380.5
1989      198.7       26.4        115.6       250.2       539.8


Panel B: Value of options granted as a percentage of the absolute value of net income

Year      Mean    1st Quartile   Median   3rd Quartile   90th %
Pooled     9.3%       0.2%         1.9%        4.5%       10.2%
1979       1.7%       0.0%         0.6%        1.9%        5.0%
1980       6.3%       0.0%         1.0%        3.1%        6.9%
1981       2.6%       0.0%         1.0%        2.8%        5.5%
1982       4.4%       0.1%         1.9%        4.8%        8.6%
1983       6.4%       0.2%         2.4%        5.3%       12.4%
1984       5.3%       0.5%         1.9%        4.0%        8.3%
1985       5.7%       0.4%         2.4%        4.7%       10.6%
1986       5.9%       0.6%         2.6%        5.8%       14.4%
1987      20.7%       1.4%         4.4%        7.2%       11.4%
1988      32.1%       0.6%         1.9%        4.2%       11.7%
1989      11.9%       0.6%         2.9%        6.3%       16.0%
without the ESO grant, the dividend-adjusted Black-Scholes value of ESOs granted (SO) is
subtracted from current year reported income.¹⁸

NIA represents the deviation of adjusted income from target income for firms above their 
target level and NIB represents the deviation of adjusted income from target income for firms 
below their target level. In this study, I hypothesize firms below their target level of income will 
substitute options for other forms of compensation to boost their reported income. This suggests 
a negative relation between the value of options granted and NIB.¹⁹
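
The construction of the target and of NIA and NIB can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the drift is taken to be the change in income between the most recent prior year and the year five years before it, divided by five, and the example figures are invented.

```python
def target_income(income_history):
    """Target income for the current year.

    income_history lists income before extraordinary items for at least
    the six prior years, oldest first. The target is last year's income
    plus the average annual change over the prior five years when that
    drift is positive, and last year's income otherwise.
    """
    last_year = income_history[-1]
    drift = (income_history[-1] - income_history[-6]) / 5.0
    return last_year + drift if drift > 0 else last_year


def income_deviations(net_income, so_value, target, total_assets):
    """Return (NIA, NIB): the scaled deviation of income adjusted for the
    value of options granted (SO) from target, split by sign."""
    deviation = (net_income - so_value - target) / total_assets
    nia = deviation if deviation > 0 else 0.0
    nib = deviation if deviation < 0 else 0.0
    return nia, nib


# Hypothetical firm: six prior years of income, then the current year.
target = target_income([100.0, 104.0, 110.0, 108.0, 115.0, 120.0])
print("Target:", target)
print("NIA, NIB:", income_deviations(118.0, 2.0, target, 1000.0))
```

In this example adjusted income falls short of the target, so NIA is zero and NIB is negative; a firm above its target would show the reverse.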

The expected relation between the use of ESOs and NIA is less clear. If managers adjust their 
use of stock options to smooth reported income, one would expect a negative relation between 
the value of stock options granted and NIA. On the other hand, if total compensation (both cash 
compensation and ESOs) reflects a pay for performance incentive scheme, one would expect a 
positive relation between the use of ESOs and NIA.²⁰

To provide information on the magnitude of employee stock option grants, Panel B of table 
4 presents the value of ESO grants as a percentage of the absolute value of net income. Although 
the median percentage is only 1.9% (perhaps reflective of the bias towards larger and more 
successful firms) the distribution is highly skewed. For the top 10 percent of the observations, the 
value of options granted represents at least 10 percent of reported net income. 

Hypothesis 2 suggests that firms use ESOs as part of a long-term earnings management 
strategy. The ACC variable, which is based upon the work of Press and Weintrop (1990) and 
Zmijewski and Hagerman (1981), measures the extent to which the firm employs an income- 
increasing accounting strategy. The income-increasing (decreasing) choices are (a) FIFO 
(LIFO), (b) straight-line (accelerated) depreciation, (c) an amortization period of past service 
costs greater than or equal to 30 years (less than 30 years), and (d) accounting for the investment 
tax credit using the flow-through (deferral) method.²² For the pooled sample, approximately
18.6% of the observations used accelerated depreciation methods, 61.3% used LIFO, 24.2% used 
amortization periods of less than 30 years, and 11 percent used the deferral method of accounting 
for the investment tax credit. 


18 Although the theoretical literature suggests that the cash equivalent value of stock options is below the Black-Scholes 
value (see Huddart 1994), there is little empirical evidence with regard to the magnitude of the “discount.” The 
sensitivity of the tests to alternative “discounts” is explored in the discussion of the results. 

19 Another income pattern suggested in the literature is that of the “big bath.” In this context, this would suggest that firms 
that are way below their target income level would reduce the amount of stock options granted and increase the amount 
of cash compensation. While this is not expected to be the case, if this practice is followed, it will reduce the power of 
the test on NIB. 

20 Adoptions of stock option plans and amendments to the plan (including expanding the number of shares available for
option grants) must be approved by shareholders. However, once firms have approved plans, the board of directors has 
discretion over the number of options granted, as long as the number is within the approved limit. In addition, firms can 
also grant options in a given year in excess of the approved limit subject to the subsequent approval of the option plan 
or plan amendment at the next shareholders" meeting. The timing of option grants within the fiscal year is not available 
for sample firms during the test period. This information was available in the 1993 proxy statements for 87 sample firms. 
A total of 39 percent of the 1992 option grants occurred during the first quarter, 28 percent in the second quarter, 15 
percent in the third quarter and 17 percent in the fourth quarter. 

21 This variable assumes that accounting choices and ESOs are complementary methods of increasing income.

22 The classification of an accounting choice as income-increasing or -decreasing follows the classification used by
Hagerman and Zmijewski (1979). In general, FIFO is income-increasing if factor prices rise and production is sufficient 
to avoid the liquidation of LIFO inventory layers. The remaining income-increasing accounting choices spread expenses 
over a longer period of time (straight-line depreciation and a longer pension cost amortization period) or accelerate the 
recognition of expense credits (flow-through ITC). As long as factor prices remain stable (or rise) and the firm maintains 
its current activity (replaces its assets) the classification of these choices as income-increasing should hold. However, 
there could be isolated cases in which the accounting choice is misclassified. This would serve to induce measurement 
error in the variable. 
The new pension accounting rules, adopted by many firms in 1986, eliminated the choice of
an amortization period for prior service costs and the 1987 tax regulations eliminated the 
investment tax credit. Therefore, the number of available accounting choices changed during the 
sample period. To control for the change, the accounting choice variable is scaled by the number 
of choices available. 
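
To make the scaling concrete, the following is a minimal sketch, assuming the unscaled measure is a simple count of income-increasing choices (the text does not spell out the construction in more detail). The function name `acc_score` and the dictionary layout are hypothetical and serve only as an illustration.

```python
from typing import Dict

# Income-increasing alternative for each accounting choice, as listed in the text.
INCOME_INCREASING = {
    "inventory": "FIFO",
    "depreciation": "straight-line",
    "pension_amortization": ">=30 years",
    "itc": "flow-through",
}


def acc_score(choices: Dict[str, str]) -> float:
    """Share of the available choices for which the firm uses the
    income-increasing method. `choices` holds only the choices that are
    available to the firm in that year (hypothetical input layout)."""
    if not choices:
        raise ValueError("no accounting choices available")
    increasing = sum(
        1 for name, method in choices.items() if INCOME_INCREASING[name] == method
    )
    return increasing / len(choices)


# Four choices available (an early sample year).
early = {
    "inventory": "LIFO",
    "depreciation": "straight-line",
    "pension_amortization": ">=30 years",
    "itc": "flow-through",
}
# Only two choices remain once the pension and ITC choices are eliminated.
late = {"inventory": "LIFO", "depreciation": "straight-line"}

print(acc_score(early))  # 0.75
print(acc_score(late))   # 0.5
```

Dividing by the number of available choices keeps the measure comparable across years: a firm that makes the same inventory and depreciation choices is not mechanically assigned a lower value merely because two choices disappeared.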

Skinner (1993) shows that the inventory accounting choice reflects the value of a firm’s 
investment opportunities. Therefore, I also conduct tests using an alternative variable, ACC2, that 
does not include the inventory accounting choice. The tests of ACC2 include observations from 
the sub-period ending in 1985 during which the remaining three choices were available. 


Taxes 


As discussed earlier, incentive stock options impose a tax cost on the firm through the denial 
of a tax deduction. Because options are generally granted to highly paid executives who face 
similar individual tax rates, the tax effects for the employees are assumed to be constant across 
firms. Given that assumption, the tax cost to the firm is a linear function of the firm's estimated 
future marginal tax rate. The estimated future marginal tax rate is measured using an estimate of 
the current marginal tax rate as discussed in Shevlin (1990). This procedure uses a simulation
to estimate the present value of the change in taxes payable for an additional dollar of income 
earned in the current year. Because the tax cost only applies to ISOs, and ISOs were created in 
1981, the tax variable is multiplied by a dummy variable that is equal to 1 if the year is after 1980 
and 0 otherwise. 


Incentives 


Studies indicate that equity securities reduce the incentive costs arising from the separation 
of ownership and management (e.g., Haugen and Senbet 1981; Hagerty et al. 1990; and Hemmer 
1993). Equity securities provide managers with a stake in the value of the firm that reduces the 
manager’s consumption of perquisites and increases effort. 

In addition, Lewellen et al. (1987) suggest that ESOs offset differences in time horizon and 
risk aversion. Due to a finite employment period, a manager’s time horizon is likely to be 
relatively short. Managers are also likely to be less diversified (due to firm specific human capital) 
and more risk averse than owners. Stock options typically have a ten year maturity, and can 
lengthen the manager’s time horizon. Lambert et al. (1991) show that the convex relation between 
option value and stock price could offset the manager’s risk aversion. 

Studies such as Smith and Watts (1992) and Gaver and Gaver (1993) provide evidence of 
relations between the use of ESOs, the value of the firm’s growth opportunities, firm size and 
financial leverage. The findings with regard to the firm’s growth opportunities support past 
theoretical work that suggests ESOs encourage managers to undertake risky projects. Consistent 
with the empirical studies referred to above, the market to book ratio and the level of research and 
development expenditures are used as measures of the firm’s growth opportunities.

Both firm size and leverage could serve as proxies for a number of different effects, such as 
the level of growth opportunities, the variance of cash flows, financing policy, and the cost of 
monitoring management. Although it is difficult to ascribe their effects to a specific cause, the past 
empirical regularity suggests that they should be included as control variables. 

I also include a measure of the proportion of the risk of the firm that is firm-specific. This 
variable (also discussed in Hagerty et al. 1990) is included to capture two effects. The first is the 
risk premium demanded by the manager and the second is the cost of monitoring management. 


23 The method used here is described as Series I in Shevlin (1990). 
Finally, as discussed in Jensen and Meckling (1976), management ownership reduces the
need to align managerial wealth to shareholder wealth. Therefore, the percentage of the firm’s 
stock held by insiders (as reported in the firm’s proxy statement) is included as a control variable. 


Liquidity and Year/Industry Effects 


Because the distribution of ESOs allows a firm to conserve cash, and firms receive cash upon the
exercise of the option, working capital is included as a measure of liquidity. The regression also
includes intercept dummy variables for each year and each of the four most common industry
groups. The data in table 2 suggest the use of employee stock options tends to differ over time.
In addition, past studies, such as Ely (1991) and Smith and Watts (1992), suggest compensation 
contracts differ across industries. 


Regression Equations 


The following regression examines the relation between the value of options granted by a 
firm in a given year and the perceived benefit of reporting higher levels of income. 


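$$
\begin{aligned}
V_{it} = {} & \beta_0 + \sum_{y=1}^{10} \beta_y\, DYear + \sum_{ind=11}^{14} \beta_{ind}\, DSIC + \beta_{15} NIA_{it} + \beta_{16} NIB_{it} + \beta_{17} ACC_{it} + \beta_{18} D81_t TAX_{it} \\
& + \beta_{19} MKBK_{it} + \beta_{20} RD_{it} + \beta_{21} RSK_{it} + \beta_{22} SIZ_{it} + \beta_{23} LEV_{it} + \beta_{24} LIQ_{it} + \beta_{25} INSID_{it} + \varepsilon_{it}
\end{aligned}
\tag{3}
$$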


where the i subscript refers to the firm and the t subscript refers to the time period. The variables
are summarized in table 3.

The two test variables in this analysis are NIB (the extent to which as-if income is below a
target level) and ACC (the extent to which a firm uses income-increasing accounting methods).
The predictions are (1) the lower the value of reported income relative to a target level of income,
the greater the value of ESOs, i.e., $\beta_{16}$ is negative, and (2) the greater the firm's reliance upon
income-increasing accounting methods, the greater the value of ESOs, i.e., $\beta_{17}$ is positive.

Because the data set contains a significant number of observations with the dependent 
variable equal to zero, and the dependent variable cannot be negative, the above regression is 
estimated using Tobit as well as OLS (the OLS results are reported). In addition, because Maddala 
(1991) suggests that Tobit is not always appropriate in such circumstances, I conduct tests that 
assume a two-stage decision process. A firm first decides whether to grant options and, if it
decides to grant options, then determines the value of options to grant. Probit is used to estimate
the first stage (the dependent variable is equal to 1 if the firm issued employee stock options and 
0 otherwise) and OLS is used to estimate the second stage (the value of options granted given that 
the firm issued options, i.e., observations with V=0 are excluded). 
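
The sample construction behind the two-stage approach can be sketched as follows. This is a minimal illustration on made-up firm-year records, not the study's data; the record layout is an assumption, and the estimation itself (Probit, OLS) is left to a statistics package.

```python
from typing import List, Tuple

# Hypothetical firm-year records: (firm id, year, value of options granted V).
observations: List[Tuple[str, int, float]] = [
    ("A", 1985, 0.0),
    ("A", 1986, 1.8),
    ("B", 1985, 0.6),
    ("B", 1986, 0.0),
    ("C", 1985, 2.4),
]

# Stage 1 (Probit in the text): indicator equal to 1 if options were granted.
granted = [1 if v > 0 else 0 for _, _, v in observations]

# Stage 2 (OLS in the text): keep only firm-years with V > 0.
second_stage = [(firm, year, v) for firm, year, v in observations if v > 0]

print(granted)            # [0, 1, 1, 0, 1]
print(len(second_stage))  # 3
```

The first stage uses every firm-year, while the second stage uses only the grant years, which is why the two stages can assign different signs to the same regressor.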


Results 


The results for the pooled regressions are summarized in table 5. Because accounting rules 
restrict changes in accounting choices, the t-statistics are calculated using the larger of the OLS 
standard errors and the standard errors estimated using a method discussed in Froot (1989). The 
Froot technique adjusts for correlations caused by multiple observations for a given firm. The 
results in column 1 reflect OLS estimation of equation (3). Consistent with hypothesis 1, the 
estimated coefficient for the NIB variable is negative and significant at the 5 percent level, and, 


24 The four industries represented are SIC code 28, chemicals and allied products (29 firms), SIC code 35, industrial
machinery and computer equipment (15 firms), SIC code 36, electronic and other electrical equipment (10
firms), and SIC code 37, transportation equipment (11 firms).
TABLE 5
Pooled Regressions of the Value of ESOs Granted Against Financial Reporting
Cost and Control Variables

$$
\begin{aligned}
V_{it} = {} & \beta_0 + \sum_{y=1}^{10} \beta_y\, DYear + \sum_{ind=11}^{14} \beta_{ind}\, DSIC + \beta_{15} NIA_{it} + \beta_{16} NIB_{it} + \beta_{17} ACC_{it} + \beta_{18} D81_t TAX_{it} \\
& + \beta_{19} MKBK_{it} + \beta_{20} RD_{it} + \beta_{21} RSK_{it} + \beta_{22} SIZ_{it} + \beta_{23} LEV_{it} + \beta_{24} LIQ_{it} + \beta_{25} INSID_{it} + \varepsilon_{it}
\end{aligned}
$$

                               Estimated Coefficient
                                   (t-statistic)
Variable    Pred. Sign      (1)            (2)            (3)

NIA         none           0.172          1.448          0.099
                          (0.691)        (0.982)        (0.378)

NIB         -             -0.299**       -1.727         -0.266*
                         (-1.822)       (-1.185)       (-1.299)

ACC         +              0.082**        0.838***       0.049*
                          (2.319)        (3.976)        (1.308)

D81TAX      -             -0.001        - 0.010         -0.001
                         (-0.425)        (1.534)       (-1.131)

MKBK        +              0.131***      -0.017          0.178***
                          (3.847)       (-0.149)        (4.869)

RD          +              0.294          2.388          0.207
                          (0.608)        (1.054)        (0.382)

RSK         none           0.074         -2.580***       0.172
                          (0.766)       (-4.848)        (1.555)

SIZ         none           0.611        -15.775***       4.487***
                          (0.884)       (-3.495)        (2.745)

LEV         none          -0.236          0.918**       -0.074
                         (-0.310)        (2.279)       (-0.901)

LIQ         -             -0.012          0.209         -0.040
                         (-0.161)        (0.472)       (-0.443)

INSID       -             -0.007*        -0.026         -0.006
                         (-1.388)       (-0.636)       (-1.067)

n                          1,317          1,317          1,033
adj. R²                    0.163                         0.205


Notes: A description of the variables can be found in table 3.
t-statistics are based on the higher of the OLS/Probit standard errors and the Froot-adjusted standard errors.
(1) Dependent variable is the value of options granted per employee in constant dollars. The model is estimated
using OLS.
(2) Dependent variable is equal to 1 if the firm granted options in that year and 0 otherwise. The model is estimated
using Probit.
(3) Dependent variable is the value of options granted per employee in constant dollars. Observations with
V=0 are excluded. The model is estimated using OLS.
* Significant at p < 0.10. ** Significant at p < 0.05. *** Significant at p < 0.01.
Tests are one-tailed for variables with a predicted sign, and two-tailed otherwise.
consistent with hypothesis 2, the estimated coefficient for the ACC variable is positive and
significant at the 5 percent level. The results derived from a Tobit estimation (not reported) were 
generally stronger (t-statistic for the coefficient on NIB was -2.173 and the t-statistic for the 
coefficient on ACC was 3.459). 

Column (2) of table 5 presents the results from a regression with the dependent variable equal 
to one if the firm granted ESOs and zero otherwise. The NIB variable is negative, as predicted, 
but not significant. However, the coefficient on the accounting choice variable, ACC, is positive 
and significant at the 1 percent level. 

The results of the OLS regression used to examine the value of options granted, given that 
options were granted, are in column (3) of table 5. The estimated coefficients on NIB and ACC 
are significant in the predicted directions at the 10 percent level. 

The significant negative coefficient on SIZ (the inverse of total assets) in regression (2) 
suggests that larger firms tend to grant options more frequently. However, the significant positive 
coefficient on the same variable in regression (3) suggests that smaller firms that use ESOs tend 
to have larger option grants than larger firms. Similarly, the extent of growth opportunities 
(represented by MKBK) is not significant in explaining whether a firm will issue options, but 
given that a firm has chosen to grant options, the variable is significant in explaining the value 
of options granted. 

To further evaluate the effects of possible inter-temporal dependence, I estimate the
regressions, excluding the year dummy variables, for individual years. Table 6 reports the results
for the test variables. The estimated coefficient for NIB is negative, as predicted, in eight of the
11 years, but significantly less than zero in only three of the 11 years. The data in table 6 also
weakly support a positive relation between ACC and the value of options granted. The estimated 
coefficient is positive in every year but significant in only four of the 11 years. 
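
The binomial calculation mentioned in the footnote on the NIB result can be checked with a short sketch. It assumes that "three successes" means three or more significant years out of 11, with a 5 percent chance of a spuriously significant coefficient in each year and independence across years; the function name `binomial_tail` is illustrative.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Three or more significant years out of 11 at a 5 percent level.
prob = binomial_tail(n=11, k=3, p=0.05)
print(f"{prob:.4f}")  # 0.0152
```

The result of roughly .015 is consistent with the stated bound of .02, although, as the footnote cautions, the independence assumption ignores possible inter-temporal dependence.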

Two additional tests were used to further investigate the relation between income (relative 
to target) and the value of stock options granted. The first uses a set of firm-specific regressions 
with V as the dependent variable and NIA and NIB as the independent variables. The t-statistics
on the individual coefficients are then aggregated using the method outlined in Healy et al.
(1987). The second test is a pooled regression with firm-specific dummy variables. In both tests,
the estimated coefficient for NIB is negative, but not significantly different from zero at
conventional levels. Although the small number of observations for each firm reduces the power
of these tests, the results weaken confidence in the conclusions from the main regressions reported
in table 6. 
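
The aggregation of firm-level t-statistics can be illustrated with a minimal sketch. The form used below, in which each t-statistic is standardized by its null standard deviation sqrt(k/(k-2)), with k the degrees of freedom, and the sum is divided by the square root of the number of firms, is an assumption about the aggregation and is not taken from the text. The input values are hypothetical, and the `floor` argument mirrors the winsorizing of extreme t-statistics described in the accompanying footnote.

```python
from math import sqrt
from typing import List, Optional, Tuple


def aggregate_t(stats: List[Tuple[float, int]], floor: Optional[float] = None) -> float:
    """Aggregate firm-level t-statistics into a single Z-statistic.

    Each entry is (t-statistic, degrees of freedom) with df > 2. Under the
    null, a t-statistic has variance df / (df - 2), so each one is
    standardized before summing. `floor` optionally winsorizes from below.
    """
    if not stats:
        raise ValueError("at least one t-statistic is required")
    total = 0.0
    for t, df in stats:
        if df <= 2:
            raise ValueError("degrees of freedom must exceed 2")
        if floor is not None:
            t = max(t, floor)
        total += t / sqrt(df / (df - 2))
    return total / sqrt(len(stats))


# Hypothetical firm-level results: one extreme value dominates the raw sum.
firm_stats = [(-12.0, 8), (-0.5, 8), (0.3, 8), (-0.8, 8)]
raw = aggregate_t(firm_stats)
winsorized = aggregate_t(firm_stats, floor=-7.0)
print(raw, winsorized)
```

Because the aggregate is a scaled sum, a single extreme firm-level statistic can drive the result, which is why the winsorized and unwinsorized aggregates can lead to different inferences.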

As noted previously, the use of the Black-Scholes model introduces two sources of error into 
the tests. Because managers are unable to fully diversify the risk of holding ESOs, risk-averse 
managers are likely to exercise their ESOs earlier than would be predicted by the dividend- 
adjusted Black-Scholes model. Therefore, the expected value of the ESO is likely to be lower than 
the Black-Scholes value, and the cash equivalent value of the ESO to a risk-averse manager is
likely to be less than the expected value. 

The sensitivity of the tests to the valuation of ESOs is examined by estimating the main
regression (column 1 of table 5) under alternate assumptions. The first adjustment varies the 


25 A binomial test indicates that the probability of getting three successes in 11 trials, where the probability of success
is 5 percent, is less than .02. Note, however, that this test assumes independence across trials and therefore does not
correct for the possible inter-temporal dependence.

26 The aggregated t-statistic for the estimated coefficient of NIB computed on the full sample is significantly negative
(-1.833). However, if the individual firm t-statistics are winsorized to -7.0, the aggregated t-statistic is not significant
(-1.016).
TABLE 6
Annual Regressions of the Value of ESOs Granted Against Financial Reporting
Cost and Control Variables

$$
\begin{aligned}
V_{it} = {} & \beta_0 + \sum_{ind=1}^{4} \beta_{ind}\, DSIC + \beta_5 NIA_{it} + \beta_6 NIB_{it} + \beta_7 ACC_{it} + \beta_8 D81_t TAX_{it} + \beta_9 MKBK_{it} + \beta_{10} RD_{it} \\
& + \beta_{11} RSK_{it} + \beta_{12} SIZ_{it} + \beta_{13} LEV_{it} + \beta_{14} LIQ_{it} + \beta_{15} INSID_{it} + \varepsilon_{it}
\end{aligned}
$$

Year          $\beta_5$ (NIA)   $\beta_6$ (NIB)   $\beta_7$ (ACC)     n      adj. R²
Pred. sign    none              -                 +

1979           1.131             0.203             0.081*            120     0.221
              (1.445)           (0.298)           (1.560)

1980           7.595***         -3.113***          0.099             120     0.265
              (2.939)          (-3.299)           (1.117)

1981           0.439            -3.704***          0.027             123     0.362
              (0.661)          (-3.676)           (0.413)

1982          -0.848            -0.214             0.083*            120     0.290
             (-0.702)          (-0.701)           (1.574)

1983          -0.124            -1.050             0.075             121     0.014
             (-0.178)          (-0.857)           (0.692)

1984           0.493            -0.971             0.184**           122     0.114
              (0.762)          (-0.612)           (1.931)

1985           1.293            -1.041**           0.112             121     0.107
              (1.301)          (-1.845)           (1.253)

1986          -0.795**           0.068             0.082             119     0.368
             (-2.131)           (0.171)           (1.230)

1987          -0.509            -0.057             0.111*            119     0.126
             (-0.825)          (-0.073)           (1.422)

1988           2.191***         -0.268             0.063             120     0.228
              (5.307)          (-1.010)           (1.031)

1989           0.015             0.152             0.006             112    -0.040
              (0.048)           (0.195)           (0.066)


Notes: A description of the variables can be found in table 3.
* Significant at p < 0.10. ** Significant at p < 0.05. *** Significant at p < 0.01.
Tests are one-tailed for NIB and ACC and two-tailed for NIA.


Matsunaga—The Effects of Financial Reporting Costs on the Use of Employee Stock Options 17 


assumption with regard to the manager's discount for risk. This test also assesses the sensitivity 
of the results to the potential mechanical correlation between V and NIB (I subtract the estimated 
value of the ESO grant from reported income to derive “as-if” income). The second adjustment 
examines the sensitivity of the results to changes in exercise patterns by reducing the life of the 
option from ten years to five years.

Table 7 presents the results from the sensitivity tests. For comparison purposes, the first row 
of table 7 replicates the results from the main regression. In that regression, T was set equal to ten 
(unless the firm disclosed that the options had a five year life) and the risk discount was set equal 
to 0. In rows two through four, the risk discount increases, until, in row four, NIB compares
reported net income to target net income. The estimated coefficient and t-statistic for NIA rise and
the estimated coefficient and t-statistic for NIB fall in magnitude as the risk discount increases. Although this
pattern occurs mechanically (NIA and NIB are increased by a percentage of the value of the ESO
grant), with a 25 percent risk discount, $\beta_{16}$ is negative and significant at the 10 percent level.

Rows five through eight of table 7 present the results of the regressions excluding ACC and 
leaving NIB as the only income management variable. The results are somewhat stronger,
suggesting that the two income management variables are measuring common effects (i.e., the 
short-term and long-term income management strategies are not independent). 

Rows nine through 12 of table 7 illustrate the effects of reducing the expected term of the 
options (i.e., setting T=5). Unlike the risk discount, this adjustment affects the calculation of the 
dependent variable as well as the independent variables. The results using T=5 are stronger than
the results derived from setting T=10 for all risk discount levels. 
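
The two adjustments can be illustrated with a minimal sketch. It assumes the standard dividend-adjusted Black-Scholes call formula as the valuation model and uses hypothetical inputs; the paper's own valuation equation is not reproduced here, and the function names are illustrative. The `as_if_income` helper follows the definition of the discount given in the notes to table 7.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(s: float, k: float, t: float, r: float,
                       sigma: float, q: float = 0.0) -> float:
    """Dividend-adjusted Black-Scholes value of a European call.

    s: stock price, k: exercise price, t: years to maturity,
    r: risk-free rate, sigma: volatility, q: dividend yield.
    """
    d1 = (log(s / k) + (r - q + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s * exp(-q * t) * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)


def as_if_income(reported_income: float, option_value: float,
                 discount: float) -> float:
    """Reported income less the undiscounted share of the option value.

    discount = 0 deducts the full option value; discount = 1 leaves
    reported income unchanged.
    """
    return reported_income - (1.0 - discount) * option_value


# Hypothetical at-the-money grant.
inputs = dict(s=50.0, k=50.0, r=0.08, sigma=0.30, q=0.02)
v10 = black_scholes_call(t=10.0, **inputs)
v5 = black_scholes_call(t=5.0, **inputs)
print(v10, v5)

# The risk discount changes only the income variables, not the option value.
print(as_if_income(100.0, 20.0, 0.25))  # 85.0
```

Shortening the assumed life lowers the option value itself, and therefore the dependent variable, whereas the risk discount only changes how much of that value is deducted from reported income.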

As noted earlier, the inventory accounting choice could proxy for differences in the 
investment opportunity set or tax status. To investigate the influence of the other accounting 
choices, ACC2 is defined as an accounting choice measure excluding the inventory accounting 
choice. I estimate the regression with ACC2 over the period ending in 1985 in which all of the 
remaining accounting choices were available (847 firm-year observations). The results (not 
reported) are consistent with the results reported in column 1 of table 5. The estimated coefficient 
on ACC2 is 0.080 and the Froot-adjusted t-statistic is 1.721, which is significant at the five percent
level.


IV. FINANCIAL REPORTING AND THE CHOICE OF EQUITY SECURITY 


To test the hypotheses, it is necessary to separate the income management motive of ESO 
grants from the setting of optimal incentives. The preceding tests used independent variables to 
control for incentive effects. In this section, I control for incentives by comparing the use of ESOs 
to securities with similar return distributions, but different financial reporting implications, 
namely, SARs and tandem securities (which give the manager the choice of exercising a stock 
option or an SAR). In all cases the manager receives the appreciation in price from the grant date 


27 As shown by Hemmer et al. (1994), setting T=5 overstates the expected value of an option with an expected life of five
years. However, the true expected value depends upon the employee's exercise strategy (which is unobservable). 
Therefore I assume that differences in exercise strategy are randomly distributed across the sample firms. 

28 Two other sensitivity tests were conducted. To examine the effects of outliers, values of (NI-SO-Target)/Total Assets
that were below the first percentile were winsorized to the first percentile value and values of V that exceeded the ninety-
ninth percentile were winsorized to the ninety-ninth percentile value. To examine the effects of influential observations,
observations identified as having an undue influence on NIB and/or ACC were excluded from the analysis. The
influential observations were identified as having exceedingly high DFBETAs (i.e., > 6/7,) and resulted in the deletion 
of 11 observations. The results are qualitatively similar to the ones reported in column 1 of table 5 (i.e., signs of the 
coefficients and levels of significance are unchanged). 
TABLE 7
Examination of the Sensitivity of Results to Option Valuation Assumptions Regarding
Time to Maturity and Option Valuation Discount

$$
\begin{aligned}
V_{it} = {} & \beta_0 + \sum_{y=1}^{10} \beta_y\, DYear + \sum_{ind=11}^{14} \beta_{ind}\, DSIC + \beta_{15} NIA_{it} + \beta_{16} NIB_{it} + \beta_{17} ACC_{it} + \beta_{18} D81_t TAX_{it} \\
& + \beta_{19} MKBK_{it} + \beta_{20} RD_{it} + \beta_{21} RSK_{it} + \beta_{22} SIZ_{it} + \beta_{23} LEV_{it} + \beta_{24} LIQ_{it} + \beta_{25} INSID_{it} + \varepsilon_{it}
\end{aligned}
$$

Row     T^a    Discount^b    $\beta_{15}$ (NIA)   $\beta_{16}$ (NIB)   $\beta_{17}$ (ACC)     n       adj. R²

(1)     10     0              0.172               -0.299**              0.082**             1,317    0.162
                             (0.690)             (-1.822)              (2.319)

(2)     10     25%            0.214               -0.245*               0.082***            1,317    0.163
                             (0.826)             (-1.524)              (2.328)

(3)     10     50%            0.262               -0.197                0.082***            1,317    0.163
                             (0.958)             (-1.223)              (2.328)

(4)     10     100%           0.377               -0.131                0.082***            1,317    0.165
                             (1.205)             (-0.808)              (2.341)

(5)     10     0              0.185               -0.330**                                  1,317    0.156
                             (0.750)             (-2.010)

(6)     10     25%            0.228               -0.275**                                  1,317    0.155
                             (0.884)             (-1.857)

(7)     10     50%            0.275               -0.227*                                   1,317    0.155
                             (1.013)             (-1.402)

(8)     10     100%           0.392               -0.160                                    1,317    0.158
                             (1.255)             (-1.036)

(9)     5      0              0.164               -0.253**              0.063**             1,317    0.171
                             (0.846)             (-2.019)              (2.316)

(10)    5      25%            0.196               -0.211**              0.063***            1,317    0.171
                             (0.973)             (-1.698)              (2.330)

(11)    5      50%            0.231               -0.174*               0.064***            1,317    0.171
                             (1.093)             (-1.393)              (2.337)

(12)    5      100%           0.317               -0.122                0.064***            1,317    0.173
                             (1.324)             (-0.972)              (2.343)


Notes: A description of the variables can be found in table 3.

a Assumed time to maturity used to calculate the Black-Scholes value of the stock options.

b Percentage reduction in the value of options deducted from net income to estimate "as-if" net income. When the
discount = 0, the full Black-Scholes value of the options granted is deducted from net income. When the
discount = 100%, options are not deducted from net income (i.e., "as-if" income is equal to reported income).

* Significant at p < 0.10. ** Significant at p < 0.05. *** Significant at p < 0.01.

Tests are one-tailed for NIB and ACC and two-tailed for NIA.
to the exercise date. However, the securities differ with regard to transaction costs, financial
reporting treatment and tax consequences.

The sample for the second set of tests begins with the prior sample of 123 firms. However, 
since this analysis examines the trade-offs between a specific set of securities, observations are 
only included if the firm granted stock options, SARs or tandem securities in a given year, and 
the type of security granted could be determined. 


Dependent Variable 


In this analysis, the dependent variable is the type of security issued by a firm in a year. 
Observations are coded as SO (unattached stock options) if the firm issued stock options and did 
not attach SARs to any of the options. Observations are coded as TNSO (tandem securities with 
NSOs) if the firm either (a) attached SARs to all options and only issued NSOs, (b) specifically 
attached SARs only to NSOs, but not to ISOs, or (c) issued SARs that were not attached to stock 
options. Observations are coded as TISO (tandem securities with ISOs) if the firm issued stock 
options under an ISO plan and attached SARs to some of those options. 

The income effects of TISOs depend upon whether the firm expects the employee to exercise 
the ISO or the SAR. As noted in section II, the employee trades off the tax benefits of the ISO 
against the lower transaction costs associated with the SAR. In the main tests that follow, TISOs 
are grouped with TNSOs, i.e., the dependent variable is set equal to one if the firm grants 
unattached options and 0 otherwise. Tests are then conducted to assess the sensitivity of the results 
to alternate classifications. 
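
The coding rules above can be sketched as a small classifier. The field names in `Grant` are hypothetical, and the precedence among categories when a firm-year satisfies more than one rule (TISO is checked first here) is an assumption that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Grant:
    """Hypothetical summary of one firm-year's equity grants."""
    issued_options: bool   # any stock options granted
    issued_isos: bool      # options granted under an ISO plan
    sars_on_isos: bool     # SARs attached to some ISOs
    sars_on_nsos: bool     # SARs attached to NSOs
    unattached_sars: bool  # SARs issued without a related option


def classify(g: Grant) -> str:
    """Return 'SO', 'TNSO', or 'TISO' following the coding rules in the text."""
    if g.issued_options and g.issued_isos and g.sars_on_isos:
        return "TISO"
    if g.sars_on_nsos or g.unattached_sars:
        return "TNSO"
    if g.issued_options:
        return "SO"
    raise ValueError("no stock options, SARs, or tandem securities granted")


def dependent_variable(g: Grant) -> int:
    """1 for unattached stock options, 0 otherwise (TISOs grouped with TNSOs)."""
    return 1 if classify(g) == "SO" else 0


plain = Grant(True, True, False, False, False)
tandem_iso = Grant(True, True, True, False, False)
sar_only = Grant(False, False, False, False, True)

print(classify(plain), classify(tandem_iso), classify(sar_only))  # SO TISO TNSO
print(dependent_variable(plain), dependent_variable(tandem_iso))  # 1 0
```

Recoding TISOs as 1 for the sensitivity test would only require changing the condition in `dependent_variable`.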

Table 8 shows the number of observations for each year by security issued. Since ISOs were 
created in 1981, any SARs attached to stock options prior to 1981 were attached to NSOs. As ISOs
gained popularity, fewer firms attached SARs to NSOs and more attached SARs to ISOs. Thus, 
both the number of firms and percentage of sample firms that issued SARs or attached SARs only 
to NSOs decline over time. 


Independent Variables


The test variable is the previously defined ACC variable. The extent to which the firm is under its target
income level (NIB) is excluded because the comparison is between the financial reporting
benefits of stock options relative to stock appreciation rights. Since price appreciation generally
takes place over a number of years, the decision is more likely to reflect a long-term income
strategy than an attempt to boost income in a particular year. Granting ESOs instead of SARs
would only have a material income effect if the firm experienced a substantial degree of price 
appreciation during the year. Such a firm would be unlikely to be under its target level of income. 

As in the previous analysis, I include intercept dummy variables for year and industry. The 
data in table 8 indicate that the choice of equity security varies over time. Other control variables 
include D81TAX (intended to capture the corporate tax cost of ISOs) and the Black-Scholes value 


29 The incentive effects could differ across securities for two reasons. First, the difference in the tax treatment of the
securities would likely lead to a difference in portfolio composition and after-tax returns. Second, during the sample 
period "insiders" are required to hold onto shares acquired from the exercise of stock options for six months. This six 
month holding period could result in different incentives between stock options and SARs. These two incentive 
differences are considered to be of second-order for the purposes of this study. 

30 Firms rarely issue unattached SARs. The test sample includes nine firms (33 firm-year observations) that issued SARs
that were not attached to stock options. Since managers who are granted tandem securities with NSOs tend to exercise 
the SAR (thus saving transactions costs), unattached SARs are combined with tandem securities with NSOs in the 
empirical tests. 
TABLE 8
Frequency of Type of Security Granted


Year        SO           TNSO         TISO         Total

Pooled      411          155          171          737
Sample      (55.8%)      (21.0%)      (23.2%)

1979        21           19           0            40
            (52.5%)      (47.5%)

1980        29           34           0            63
            (46.0%)      (54.0%)

1981        29           23           5            57
            (50.9%)      (40.4%)      (8.7%)

1982        36           17           15           68
            (52.9%)      (25.0%)      (22.1%)

1983        35           13           19           67
            (52.2%)      (19.4%)      (28.4%)

1984        40           10           24           74
            (54.1%)      (13.5%)      (32.4%)

1985        37           8            20           65
            (56.9%)      (12.3%)      (30.8%)

1986        31           7            23           61
            (50.8%)      (11.5%)      (37.7%)

1987        49           9            24           82
            (59.8%)      (11.0%)      (29.2%)

1988        54           7            27           88
            (61.4%)      (7.9%)       (30.7%)

1989        50           8            14           72
            (69.4%)      (11.1%)      (19.5%)


Notes: The percentage of firms in the sample issuing each security in a given year appears in parentheses below the
number of firms.

SO: Stock option issued without a stock appreciation right.

TISO: Incentive stock option issued in tandem with a stock appreciation right.

TNSO: Nonqualified stock option issued in tandem with a stock appreciation right, or a stock appreciation
right issued without a related option.
of the option (a measure of the expected gain on the exercise of the option). The expected costs
and benefits of issuing a tandem security depend upon the probability that the manager will elect 
to exercise the SAR instead of the stock option. That probability, as well as the tax differences 
between the securities, is a function of the expected gain on exercise. 

The remaining control variables are the firm's leverage (LEV), liquidity (LIQ), size (SIZ) and
growth opportunities (MKBK). The leverage and liquidity variables are included to control for
cash flow effects. At exercise, stock options provide a cash inflow, while SARs result in a cash
outflow. The size variable captures differences in compensation agreements related to organizational
structure. Finally, the descriptive data presented in Clinch (1991) indicate that the choice
of equity security is likely to vary with the firm’s growth opportunities, consistent with the notion
that high growth firms use stock options as opposed to SARs to encourage stock ownership 
subsequent to exercise. 


Statistical Tests 


Hypothesis 2 predicts a positive relation between the probability a firm will issue a security 
that yields a lower expected level of compensation expense and the firm’s financial reporting cost. 
This prediction is tested using Probit estimation of the following pooled time-series, cross- 
sectional regression. 


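$$
\begin{aligned}
Security_{it} = {} & \gamma_0 + \sum_{y=1}^{10} \gamma_y\, DYear + \sum_{ind=11}^{14} \gamma_{ind}\, DSIC + \gamma_{15} ACC_{it} + \gamma_{16} D81TAX_{it} + \gamma_{17} BSV_{it} \\
& + \gamma_{18} LEV_{it} + \gamma_{19} LIQ_{it} + \gamma_{20} SIZ_{it} + \gamma_{21} MKBK_{it} + \xi_{it}
\end{aligned}
\tag{4}
$$

where,
$Security_{it}$ = 1 if the firm issued SOs,
                = 0 if the firm issued TISOs, TNSOs or SARs;
$BSV_{it}$ = Black-Scholes value of options (or SARs) granted, computed as in equation (1);
$\xi_{it}$ = error term;
and all other variables are as previously defined.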


Hypothesis 2 predicts a positive relation between the probability a firm will use an income-
increasing security and the firm's use of income-increasing accounting methods, i.e., $\gamma_{15}$ is
positive.


Results 


Table 9 presents the results of the Probit estimation of regression equation (4) for ACC and 
ACC2 (accounting choices excluding inventory). The regression with ACC includes the pooled 
sample over the full time period, and the regression with ACC2 includes observations pooled over 
the sub-period in which the remaining three accounting choices were available (1979-1985). As 
in the prior tests, the t-statistics are calculated using the larger of the Probit standard errors and 
the Froot-adjusted standard errors. 

As predicted, the estimated coefficient on ACC (ACC2) is positive and statistically 
significant at the 1 percent (5 percent) level. To assess the sensitivity of the results to the 
classification of TISOs, I redefine the dependent variable to code TISOs as 1, i.e., assume that the 
firm expects the employee to select the ISO. The results (not reported) indicate that the 
classification has a minor effect on the results. The t-statistic for ACC (ACC2) is 3.028 (1.697), 
which is still significant at the 1 percent (5 percent) level. However, the t-statistic on the tax 
variable (D81TAX) becomes -2.249, which is significant at the 5 percent level. This result is 
consistent with the assertion that lower marginal tax firms are more likely to grant ISOs. 


TABLE 9
Probit Regressions of Type of Security Granted Against
Financial Reporting Cost and Control Variables

$$
\begin{aligned}
Security_{it} = {} & \gamma_0 + \sum_{y=1}^{10} \gamma_y\, DYear + \sum_{ind=11}^{14} \gamma_{ind}\, DSIC + \gamma_{15} ACC_{it} + \gamma_{16} D81TAX_{it} + \gamma_{17} BSV_{it} + \gamma_{18} LEV_{it} \\
& + \gamma_{19} LIQ_{it} + \gamma_{20} SIZ_{it} + \gamma_{21} MKBK_{it} + \xi_{it}
\end{aligned}
$$

$$
\begin{aligned}
Security_{it} = {} & \delta_0 + \sum_{y=1}^{6} \delta_y\, DYear + \sum_{ind=7}^{10} \delta_{ind}\, DSIC + \delta_{11} ACC2_{it} + \delta_{12} D81TAX_{it} + \delta_{13} BSV_{it} + \delta_{14} LEV_{it} \\
& + \delta_{15} LIQ_{it} + \delta_{16} SIZ_{it} + \delta_{17} MKBK_{it} + \xi_{it}
\end{aligned}
$$

                            Estimated Coefficient
                          (Asymptotic t-statistic)
Variable    Pred. Sign    1979-1989      1979-1985

ACC         +              0.921***
                          (4.116)

ACC2        +                             0.230**
                                         (2.166)

D81TAX      none          -0.004          0.004
                         (-0.518)        (0.413)

BSV         none           0.010          0.014
                          (1.380)        (1.324)

LEV         +             -0.585         -0.244
                         (-1.246)       (-0.327)

LIQ         -              0.863          1.056
                          (1.774)        (1.530)

SIZ         none          11.463***      47.516***
                          (3.857)        (3.331)

MKBK        none           0.648***       0.768***
                          (5.161)        (4.196)

n                          737            434


Notes: A description of the variables can be found in table 3.
* Significant at p < 0.10. ** Significant at p < 0.05. *** Significant at p < 0.01.
Tests are one-tailed for variables with a predicted sign and two-tailed otherwise.
When the main regression is estimated using Probit with individual firm dummy variables,
the results (not reported) indicate that the estimated coefficient for ACC is positive and significant 
at the five percent level (t-statistic = 2.325). However, the estimated coefficient for ACC2, though 
positive, is not significant at conventional levels (t-statistic = 1.037). 

When the main tests are estimated on an individual year basis, the results (not reported) 
indicate a fairly strong association between the use of unattached stock options and ACC. 
However, the association is substantially weaker without the inventory accounting choice. The 
estimated coefficient for ACC is positive in ten years and significant at the ten percent (five 
percent) level in five (six) of the 11 years. The relation is significant during each year in the 1981-1986
period (the period in which the tax benefits of ISOs were greatest) and not significant in all
other years. In contrast, the estimated coefficient on ACC2, though consistently positive, is not
statistically significant in any individual year.

The test results provide weak evidence of a positive association between the use of
income-increasing accounting methods and the use of income-increasing equity securities. Firms that use
income-increasing accounting methods are more likely to issue stock options without attached
SARs. However, the strength of the association depends upon the inventory accounting choice.


V. CONCLUSION 


The results from this study are generally consistent with the hypothesis that the current 
financial accounting rules pertaining to employee stock options (ESOs) affect the compensation
practices of some firms. The pooled regression results support a positive relation between the use 
of ESOs and the firm's reliance on income-increasing accounting methods, and a negative 
relation between the extent a firm is below its target income level and the use of ESOs. Though 
the pooled results are statistically significant at conventional levels, they are likely to be inflated 
by inter-temporal dependence. Alternate methods, such as the aggregation of firm-specific 
t-statistics and the use of firm-specific dummy variables, fail to detect significant relations. In 
addition, the annual tests suggest that the results are not consistent across individual years in the 
sample period. 

There are also several important limitations to the study. First, the test variables, NIB and 
ACC, might proxy for correlated omitted variables or act as an instrumental variable for tax and 
cash flow effects. For example, given the tax benefits of LIFO, FIFO firms might be low tax rate
firms that have less available cash. Therefore the observed statistical relation between the use of
stock options and ACC could reflect conservation of cash rather than income management. The
most significant accounting choice appears to be the inventory accounting choice, which has been
shown to differ by industry and firm characteristics (see Skinner 1993). 

Second, the negative relation between the use of employee stock options and the extent a firm
is below its target level of income is, to some degree, mechanical, resulting from the computation 
of “as-if” (or unmanaged) income. Third, the observed association might also be consistent with 
an “incentive” explanation, whereby income acts as a signal as to whether incentives are properly 
aligned. 

The importance of the ESO reporting issue and the inconsistency of the results indicate the 
need for further research along several branches. First, the sample used in this study contains firms 
that are larger and more successful than the average firm. Much of the debate on the FASB 
Exposure Draft revolves around the financial reporting impact on smaller, younger firms, and 
future research might focus on such firms. Second, refinements in the valuation of ESOs, both 
from the perspective of measuring a “fair value” and a cash equivalent value for the options, could 
increase the power and reliability of tests of ESOs and earnings management. Finally, future
research can develop different measures of a firm's financial reporting benefits. Such refinements 
appear to be necessary in order to determine whether the accounting for ESOs has a significant
impact on their use as a component of management compensation. 


APPENDIX 

List of Firms in Final Sample 
1. Abbott Laboratories
2. Aluminum Co. of America
3. American Brands
4. American Maize Products
5. Ametek Inc.
6. AMP Inc.
7. Armstrong World Ind.
8. Atlantic Richfield
9. Avon Products
10. Badger Meter Inc.
11. Ball Corp.
12. Bally Mfg. Corp.
13. Bandag Inc.
14. Bard (C.R.) Inc.
15. Barry (R.G.)
16. Bausch & Lomb Inc.
17. Bemis Co.
18. Bethlehem Steel
19. Boise Cascade Corp.
20. Borden Inc.
21. Brown & Sharpe Mfg.
22. Brunswick Corp.
23. Brush Wellman Inc.
24. CBI Industries
25. CMI Corp.
26. CPC International
27. CTS Corp.
28. Carlisle Cos. Inc.
29. Caterpillar Inc.
30. Chrysler Corp.
31. Cincinnati Milacron Inc.
32. Clark Equipment Co.
33. Colgate Palmolive Co.
34. Cooper Industries
35. Cooper Tire & Rubber
36. Corning Inc.
37. Crane Co.
38. Crompton & Knowles
39. Cummins Engine
40. Del Laboratories
41. Dexter Corp.
42. Du Pont (E.I.) de Nemours
43. Eastern Co.
44. Eastman Kodak Co.
45. Eaton Corp.
46. Edo Corp.
47. Ethyl Corp.
48. FMC Corp.
49. Federal Signal Corp.
50. Ferro Corp.
51. Ford Motor Co.
52. Foster Wheeler Corp.
53. General Electric
54. Gillette Co.
55. Goodrich (B.F.) Co.
56. Goodyear Tire & Rubber
57. Great Lakes Chemical
58. Guardsman Products
59. Harsco Corp.
60. Hasbro Inc.
61. Hershey Foods
62. Hilton Hotels Corp.
63. Illinois Tool Works
64. IBM Corp.
65. Intl. Flavors & Fragrances
66. Kellogg Co.
67. Kenwin Shops
68. Kerr-McGee Corp.
69. Kollmorgen Corp.
70. Lilly (Eli) & Co.
71. Lockheed Corp.
72. Martin Marietta Corp.
73. Merck & Co.
74. Minnesota Mining & Mfg. Co.
75. Monsanto Co.
76. NCR Corp.
77. NL Industries
78. Nalco Chemical
79. Nashua Corp.
80. Occidental Petroleum
81. Ohio Art Co.
82. Owens Corning Fiberglass
83. PPG Industries
84. Pfizer Inc.
85. Phelps Dodge
86. Phillips Petroleum Co.
87. Pitney Bowes Inc.
88. Polaroid Corp.
89. Portec Inc.
90. Pratt & Lambert Inc.
91. Raytech Corp.
92. Raytheon Co.
93. Reynolds Metals Co.
94. Rhone-Poulenc Rorer
95. Robertson Ceco Corp.
96. Rogers Corp.
97. Schering-Plough
98. Sherwin-Williams
99. Smith (A.O.) Corp.
100. Square D. Co.
101. Stanley Works
102. Superior Industries
103. TRW Inc.
104. Textron Inc.
105. Thomas & Betts Corp.
106. Thomas Industries
107. Timken Co.
108. Titan Corp.
109. Trinova Corp.
110. USG Corp.
111. Union Carbide Corp.
112. Unisys Corp.
113. United Industrial Corp.
114. United Technologies Corp.
115. Upjohn Co.
116. Vulcan Materials Co.
117. Warner-Lambert Co.
118. Wean Inc.
119. Weis Markets Inc.
120. Western Co. of No. America
121. Westinghouse Electric Corp.
122. Weyerhaeuser Co.
123. Witco Corp.


ABSTRACT: Experienced auditors tend to structure their knowledge of financial
statement errors with audit objective as the primary organizing dimension and
transaction cycle as secondary. Yet, many audit tasks are structured in the opposite
manner, requiring auditors to assess whether objectives are met for each transaction
cycle. Our paper reports the results of an experiment which indicates that this
mismatch between knowledge structure and task structure may hinder auditors’
ability to draw on previous experiences when making conditional probability judgments
and when allocating audit hours to various objectives within cycles. These
results suggest one instance where knowledge structures that are often functional
may have adverse effects when they do not match the task structure to which they
are applied.


Key Words: Knowledge structure, Error frequencies, Probability judgments,


I. INTRODUCTION


Auditors’ organization of knowledge in memory (their "knowledge structure") is often
viewed as one of the keys to effective decision performance (see Bédard and Chi 1993;
Libby and Luft 1993; Libby 1994; Smith and Kida 1991). Experienced auditors tend to
structure their knowledge of financial statement errors with audit objective as the primary
organizing dimension and transaction cycle as secondary (Frederick et al. 1994), yet audit tasks 
are usually structured with transaction cycle as the primary organizing dimension and audit 
objective as secondary (Arens and Loebbecke 1991; Ernst & Young 1990; KPMG Peat Marwick 
1993). We investigate whether this difference between auditors’ knowledge structures for 
financial statement errors and the structure of audit planning tasks adversely affects auditors’ 
ability to access and use previously experienced error frequencies when making conditional 
probability judgments and audit planning decisions. This issue is important because it indicates 
that knowledge structures developed through experience may not always enhance audit decisions,
implying a need to weigh both the positive and negative effects of alternative knowledge
structures when considering interventions designed to either communicate structure to novice 
auditors or augment the decisions of experienced auditors. 

The audit planning process typically requires auditors to allocate audit effort across tests that 
are designed to satisfy various audit objectives for each transaction cycle. The probability that an 
audit objective is violated depends on the transaction cycle in question (e.g., validity errors occur 
more often in the sales and receivables cycle, while completeness errors occur more often in the 
acquisitions and payments cycle). Given that auditors assess the quality of accounting systems 
and internal controls at the transaction cycle level (Arens and Loebbecke 1991), it is natural to 
condition this probability judgment on transaction cycle. Knowledge of the frequency with which 
errors have occurred under similar circumstances would be useful when making such conditional 
probability judgments. However, prior research in psychology indicates that frequency knowledge
is difficult to access and apply to a probability judgment that is conditioned on a dimension
which is different from the primary dimension used to organize knowledge (Gavanski and Hui 
1992; Sherman et al. 1992). Thus, given auditors’ objective-dominant knowledge structures, 
auditors' probability judgments and audit planning decisions may reflect frequency information 
when conditioned on objective (the dominant dimension), but not when conditioned on cycle (the 
secondary dimension). Because the allocations of audit effort that result from these conditional 
probability estimates determine to a large extent whether over- or under-auditing occurs, they 
have an important influence on audit effectiveness and efficiency. 

To examine this issue, we performed an experiment in which auditors acquired error 
frequency knowledge through experience, estimated probabilities of error of the form P(objective
error | cycle) or P(cycle error | objective), and allocated hours of audit effort within various 
transaction cycles and audit objectives. The auditors also performed sort tasks and estimated 
unconditional frequencies to provide data to identify the dominant dimension of their knowledge 
structures and to rule out potential alternative explanations for results. 

Results were consistent with expectations. The sort data indicated that objective was the 
dominant feature in auditors' knowledge structures. Consistent with this objective-dominant 
knowledge structure, auditors' estimates of conditional frequencies of the form P(cycle error | 
objective) were influenced by the frequencies presented to them in the experiment, while their 
estimates of conditional frequencies of the form P(objective error | cycle) were not. Allocations 
of audit hours likewise were influenced by experimental frequencies when auditors distributed 
hours across cycles for each objective, but not when they distributed hours across objectives for 
each cycle. Additional data were used to rule out alternative explanations for the results. The 
auditors estimated unconditional frequencies equally well for both cycle and objective, indicating 
that the results cannot be explained by differential frequency learning or by auditors merely 
failing to condition their probability estimates. Also, analyses based on another group of auditors’ 
estimates of real-world frequencies indicated that the results cannot be explained by interference 
from pre-existing frequency knowledge. 
These results imply that the knowledge structures which auditors develop through experience
may hinder application of experienced frequencies to audit decisions in audits organized on a 
cycle basis. More generally, the results indicate that knowledge structures developed through 
experience or conveyed by training may either facilitate or inhibit auditors’ application of 
experienced frequencies to audit decisions, depending on the degree to which knowledge 
structure matches task structure. 

The rest of this paper proceeds as follows. Section II describes related literature in accounting 
and psychology and develops our hypotheses. Section III presents the method we used to test the 
hypotheses. Section IV presents our results. Section V provides discussion of the results and 
suggests directions for future research. 


II. RELATED LITERATURE AND HYPOTHESIS DEVELOPMENT


The Effect of Knowledge Structure on Conditional Probability Estimation 


Previous research in psychology has shown that people estimate conditional probabilities by 
accessing “natural sample spaces” in their knowledge structures (Gavanski and Hui 1992). A 
natural sample space is one which is likely to be accessed spontaneously and corresponds to the 
primary dimension of the knowledge structure a person has in his or her mind. For example, 
auditing textbooks generally portray financial statement errors as categorized primarily according
to transaction cycle, and secondarily according to audit objective (e.g., Arens and Loebbecke
1991). Figure 1 depicts such a structure.¹ For an auditor with this knowledge structure, each of
the transaction cycle categories forms a natural sample space. If asked to estimate the probability 
that a given cycle will produce an error violating the validity objective (as opposed to some other 
objective), that auditor would find it relatively easy to access a cycle category and compare the 
relative numbers of validity errors to errors violating other objectives for that cycle. The natural 
sample spaces provided by this auditor’s knowledge structure are consistent with, and therefore 
facilitate, judging P(validity error | sales and receivables cycle), or judgments of any other
conditional probabilities of the form P(objective error | cycle). 

In contrast, suppose an auditor has the knowledge structure pictured in figure 2, in which 
errors are categorized primarily according to audit objective, and secondarily according to 
transaction cycle. In estimating P(validity error | sales and receivables cycle), the auditor
possessing this knowledge structure must access and compare the number of sales and receivables 
errors that violated the validity objective to the numbers of sales and receivables errors that 
violated all other objectives, which requires that several primary and secondary categories be 
identified and accessed, rather than only several secondary categories within one primary 
category. The natural sample spaces provided by this knowledge structure are not consistent with 
judgments of P(validity error | sales and receivables cycle) (or judgments of any other conditional
probabilities of the form P(objective error | cycle)), and thus may hinder retrieval of prior
experience for use in the estimation process. Prior psychology research indicates that estimates
of conditional probabilities that require accessing such unnatural sample spaces reflect experienced
frequencies much less than do estimates that access natural sample spaces (Gavanski and
Hui 1992). In fact, subjects making estimates that require them to access unnatural sample spaces
often appear to inappropriately access a related natural sample space instead (Sherman et al.
1992).² For example, auditors with an objective-dominant knowledge structure might tend to
provide P(cycle error | objective) when P(objective error | cycle) was requested.


¹ Figures 1 and 2 describe alternative dominance relations among dimensions of a knowledge organization, rather than
the fundamental cognitive structures used to achieve those dominance relations. A number of fundamental theories are
consistent with the dominance relations pictured in these organizations. For example, exemplar models view categories
as comprised of previously encountered instances, with some features more important than others for determining
categorization (see, e.g., Estes et al. 1989). Prototype models view categorization as occurring through comparison with
a category prototype that typifies the most important features of the category (see, e.g., Rosch 1978). Connectionist
models view categorization as determined by the associations between categories and their features, with some features
having stronger associations than others (see, e.g., Gluck and Bower 1988). At some level many of these models are
indistinguishable (Barsalou 1990; Estes 1986). We make no attempt to distinguish among alternative fundamental
cognitive theories in this paper. Instead, we concentrate on the influence of knowledge organizations that are consistent
with a variety of fundamental cognitive theories.


Note that the knowledge structure depicted in figure 1 (figure 2) is inconsistent (consistent) 
with judgments of P(cycle error | objective). What determines consistency is whether the 
knowledge structure organizes primarily on the same dimension that conditions the probability 
judgment. 
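
The distinction between the two conditional probabilities can be made concrete with a small sketch. The counts, the category labels, and the function names below are hypothetical and serve only to illustrate that the two judgments draw on different sample spaces: a cycle's errors across objectives in one case, an objective's errors across cycles in the other.

```python
# Hypothetical error counts keyed by (cycle, objective); not data from the study.
counts = {
    ("sales", "validity"): 6,
    ("sales", "completeness"): 4,
    ("acquisitions", "validity"): 2,
    ("acquisitions", "completeness"): 8,
}


def p_objective_given_cycle(counts, objective, cycle):
    """P(objective error | cycle): the sample space is all errors in the cycle."""
    cycle_total = sum(n for (c, _), n in counts.items() if c == cycle)
    return counts[(cycle, objective)] / cycle_total


def p_cycle_given_objective(counts, cycle, objective):
    """P(cycle error | objective): the sample space is all errors of the objective."""
    objective_total = sum(n for (_, o), n in counts.items() if o == objective)
    return counts[(cycle, objective)] / objective_total


print(p_objective_given_cycle(counts, "validity", "sales"))   # 6 / (6 + 4)
print(p_cycle_given_objective(counts, "sales", "validity"))   # 6 / (6 + 2)
```

With these illustrative counts the same cell yields 0.6 under one conditioning and 0.75 under the other, so substituting one conditional for the other changes the estimate.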


Experienced Auditors’ Knowledge Structures 


Tubbs (1992) demonstrates that audit objective becomes increasingly salient as auditors gain 
experience, and Frederick et al. (1994) demonstrate that experienced auditors use objective as the 
primary dimension by which they sort financial statement errors. An objective-dominant
organization discriminates between errors based on cause (e.g., validity errors occur because an
amount is included in the financial statements that should not be included), which is a naturally
occurring organization in a variety of contexts (Lien and Cheng 1990). This organization may
serve a variety of important functions in auditing. For example, its causal orientation may
facilitate identification of the source of an error or the control procedure necessary to prevent or
detect the error (Tubbs 1992). Also, because many substantive tests are based on general testing
techniques that apply to objectives irrespective of cycle (e.g., vouching to examine records for
validity errors), an objective-dominant organization may facilitate the design of audit tests (Arens
and Loebbecke 1991). Thus, it may be very useful for an objective-dominant structure to develop
with experience, because it suits many of the task requirements faced by auditors (Anderson 1990; 
Murphy and Medin 1985). 

However, an objective-dominant knowledge structure is inconsistent with audit decisions 
that require estimating P(objective error | cycle) from experience, because these decisions require 
the natural sample spaces that are provided by a cycle-dominant knowledge structure. For 
example, one audit decision that requires auditors to estimate these conditional probabilities is the 
allocation in audit planning of the effort to be expended in various audit tests. This decision 
requires auditors to consider the probability that various audit objectives will be violated and, 
given that auditors assess the quality of accounting systems and internal controls at the transaction 
cycle level (Arens and Loebbecke 1991), it is natural to condition this probability judgment on 
transaction cycle (that is, to judge P(objective error | cycle)). 

Knowledge of previously experienced error frequencies is useful in estimating these conditional
probabilities. Prior research indicates that auditors learn error frequencies from experience
(Ashton 1991; Butt 1988; Libby 1985; Libby and Frederick 1990; Nelson 1993, 1994), that error
frequencies are sometimes considered consciously by auditors when making probability judgments
(Waller 1990), and that error frequencies may also influence auditors' judgments in ways
that are not conscious, e.g., by influencing the availability of potential hypotheses in memory
(Libby 1985). However, no prior studies have examined the influence of frequency information
on judgments of conditional probabilities or the degree to which knowledge structure enhances
or diminishes this influence.


² This behavior is not merely the result of semantic confusion over the meaning of a conditional probability, because
subjects asked to estimate conditionals that are consistent with their knowledge structure do not make similar errors.

This lack of research is a concern because the inconsistency between auditors’ objective- 
dominant knowledge structure and the cycle-dominant structure of audit planning may result in 
probability estimates that do not reflect experienced frequencies. Note that this inconsistency 
does not merely imply that auditors' decisions will reflect an unconditional probability estimate 
instead of a conditional probability estimate (e.g., reflecting P(validity error) rather than 
P(validity error | sales and receivables cycle)). Rather, the inconsistency between knowledge
structure and task structure implies that auditors will attempt to estimate the conditional 
probability and fail, such that audit planning decisions do not reflect previously experienced 
frequencies. 

Two aspects of the accounting context may lessen the degree to which knowledge structure 
influences auditors' ability to draw on their experience with error frequencies in estimating 
conditional probabilities. First, prior research in psychology suggests that human judgment is 
highly adaptive and responsive to the needs of decision contexts (Anderson 1990). The prior 
research demonstrating that knowledge structure can hinder estimation of conditional probabilities
examined subjects' responses to unfamiliar problems in the laboratory, and so allowed no
contextual adaptations to take place. Given the importance of estimating P(objective error | cycle)
in the auditing context, experienced auditors may have developed knowledge structures that 
enable them to apply their frequency knowledge to the estimation process. For example, although 
prior accounting research indicates that objective is the more dominant dimension in experienced 
auditors' knowledge structures, the same research demonstrates that auditors sort on either cycle 
or objective with high accuracy when instructed to do so, suggesting that both dimensions are well 
developed in auditors' knowledge structures (Frederick et al. 1994). It may be that these two 
dimensions are close enough in dominance to avoid hindering auditors’ judgments of probabilities
conditioned on either cycle or objective.

Second, the results of prior psychology studies suggest that the effect of knowledge structure 
on the estimation of conditionals diminishes with the salience of the features that distinguish 
categories (Sherman et al. 1992). Prior studies have used either simple pictorial stimuli (e.g., the 
shapes of alien faces in Gavanski and Hui 1992) or highly discriminable natural categories (e.g., 
gender, marital status, introvert/extrovert in Sherman et al. 1992; social or occupational 
categories in Gavanski and Hui 1992). Yet, even for these categories, a decrease in the 
significance of results was apparent for categories that have distinguishing features that are less 
salient. For example, Sherman et al. (1992) attribute their finding that results for the gender 
category were stronger than results for the marital status and introvert/extrovert categories to prior
suggestions by Rothbart and Taylor (in press) and Smith and Zarate (1992) that categories related 
to physical differences are particularly discriminable. The concepts of interest in the auditing 
context are not defined by obvious characteristics such as physical differences, so it is possible 
that no influence of knowledge structure on conditional probability estimation will be detectable. 

Therefore, to determine the effect of auditors' knowledge structures on their ability to apply
prior experience to conditional probability estimates, we tested the following hypothesis:


H1: Auditors' estimates of P(cycle error | objective) are influenced by experimental frequencies
more than are auditors' estimates of P(objective error | cycle). 


It is also necessary to test the degree to which the effects of knowledge structure on 
conditional probability estimation extend to audit decisions, because the manner in which audit 
decisions are made may mitigate the influence of knowledge structure. Specifically, during audit 
planning an auditor distributes planned audit effort across audit objectives for each transaction
cycle. Recall that one explanation for the influence of knowledge structure on estimation of 
conditional probabilities is that conditioning on the dominant (nondominant) feature of the 
knowledge structure facilitates (inhibits) the identification of category members and the retrieval 
of previously experienced instances of those category members. To the degree that providing the 
names of categories in the audit program performs this identification for auditors, the influence 
of knowledge structure on planning decisions may decrease. Thus, to determine whether the 
effects of knowledge structure extend to decisions, the following hypothesis is tested: 


H2: Auditors’ allocations of audit effort across cycles for each objective are influenced more by 
experimental frequencies than are auditors’ allocations of audit effort across objectives for 
each cycle. 

The following section describes the experiment used to test H1 and H2. 


III. METHOD
Subjects 


Subjects were 47 auditors from a single Big 6 firm who were enrolled in a staff training 
session. The auditors' average experience was 3.26 years. The results of Frederick et al. (1994) 
suggest that this is sufficient time to have developed a knowledge structure in which audit 
objective serves as the primary basis for sorting financial statement errors. Thus, these auditors 
are expected to form natural sample spaces for estimating conditional probabilities of financial 
statement errors on the basis of audit objectives rather than transaction cycles. The auditors were 
required to participate in the experiment as part of their training activities. 


Overview, Design, and Procedures


Overview and Design 

Subjects completed all portions of the experiment on Macintosh computers in special breakout
rooms configured for computerized instruction. The computerized procedure standardized
the timing of frequency presentations, provided immediate feedback during those instruction 
sequences that required it, and facilitated randomization of items and treatments (to be discussed 
later) and data recording. Subjects were prohibited from using reference materials or speaking 
with each other throughout the experiment. An experimenter was available at all times to 
supervise data collection and answer questions. 

Table 1 summarizes the experimental procedures. In the experiment, subjects completed a 
knowledge structure pretest, observed individual presentations of financial statement errors 
designed to convey frequencies, and completed a distractor task, conditional probability estimation
test, conditional audit decision task, unconditional frequency estimation test, unconditional
decision task, and a debriefing questionnaire. The experiment required an average of 41.25 
minutes to complete. 

The type of conditional probability estimated by the auditors (either P(cycle error | objective) 
or P(objective error | cycle)) was manipulated between-subjects, because: (1) within-subjects
manipulation might encourage subjects to relate their responses to the two types of conditionals, 
and (2) within-subjects manipulation might increase subjects' fatigue by increasing the number 
of judgments that each subject had to make. Whether the unconditional frequency estimation test 
occurred before or after the unconditional decision task was also manipulated between-subjects
to provide exploratory data for another study. Subjects were randomly assigned to one of the four 
combinations of these between-subjects factors and proceeded as follows. 
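
One way to carry out such an assignment is sketched below. The paper states only that assignment to the four combinations was random; the balancing of cell sizes, the seed, the labels, and the function name are assumptions made for illustration.

```python
import random
from itertools import product

# The two between-subjects factors described in the text (labels are illustrative).
CONDITIONAL_TYPES = ("P(cycle error | objective)", "P(objective error | cycle)")
TASK_ORDERS = ("frequency test first", "decision task first")


def assign_conditions(subject_ids, seed=0):
    """Shuffle subjects, then deal them across the four cells in turn.

    Dealing in turn keeps cell sizes within one subject of each other,
    which is an assumption; the paper does not report cell sizes.
    """
    cells = list(product(CONDITIONAL_TYPES, TASK_ORDERS))
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: cells[i % len(cells)] for i, sid in enumerate(shuffled)}


assignment = assign_conditions(range(47))
```

With 47 subjects this procedure produces three cells of 12 and one cell of 11.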
TABLE 1
Sequence of Experimental Procedures

1. Free-sort nine individual errors
2. Presentation of nine individual errors in varying frequencies (total of 49)
3. Distractor task (4 ability questions)
4. Conditional probability questions (2 questions conditioned on either cycle or objective)
5. Conditional audit decision task (3 questions allocating hours across cycles for each objective or across
   objectives for each cycle)
6. Unconditional frequency estimation task (6 questions estimating frequency of error for each cycle and
   objective)
7. Unconditional audit decision task (2 questions allocating hours across cycles and across objectives)
8. Directed sort by cycles
9. Directed sort by objectives
10. Demographic questions

Note: Order of 6 and 7 manipulated between subjects; order of 8 and 9 determined randomly.





Materials and Procedure 

Sort tasks are commonly used in cognitive psychology to determine how objects are 
classified.³ To determine whether our subjects possessed the objective-dominant knowledge
structure observed by Frederick et al. (1994), subjects were asked to sort nine financial statement 
errors into categories on the basis of “how they thought the errors best go together.” This “free 
sort” task thus requires subjects to select some primary organizing dimension. Subjects were 
required to sort the errors into three groups to hold constant the level of detail in their sorts, thus
facilitating data analysis (discussed in section IV). To accomplish the free sort task, subjects 
clicked on either a “1,” “2,” or “3” beside each error to indicate how they would group the errors. 
The order of presentation of the nine errors in the free sort task was randomized and held 
constant across subjects. Subjects could change their groupings at any time during the sort task, 
and were prompted before proceeding to reconsider their answers. The free sort task is shown in 
appendix A. 

The nine financial statement errors used in the free sort task and the remainder of the 
experiment were selected to fully cross three transaction cycles with three audit objectives. The 


3 A less-direct method commonly used in the accounting literature is examining the clustering of free recalls (e.g., Choo 
and Trotman 1991; Weber 1980). See Chi et al. (1982) for a classic study employing a sort-task methodology, and
Bédard and Chi (1993) for a discussion of the advantages of using this technique in auditing. 
transaction cycles and audit objectives used were determined by examining practice manuals
from several large public accounting firms to identify those transaction cycles and audit 
objectives for which firms demonstrate the highest agreement in categorizing financial statement 
errors. The results of Frederick et al. (1994) were then used to identify errors for which there was 
the highest agreement on both cycle and objective category membership among audit managers 
affiliated with the firm providing subjects for this study. The transaction cycles, audit objectives 
and financial statement errors resulting from this selection process are shown in table 2.⁴ These
cycles, objectives and errors were the same as those used in Bonner et al. (1993). 

After completing the free sort task, subjects viewed presentations of the nine individual 
financial statement errors in varying frequencies, for a total of 49 presentations. A different 
random order of the 49 presentations was used for each subject. To avoid primacy and recency 
effects, the random orders were determined with the constraints that: (1) each repetition of an error 
was separated by at least one other error, and (2) the three highest frequency errors were presented 
as often in the first half of the sequence as in the second half. The frequencies of individual errors
were determined by randomly assigning the three transaction cycles and the three audit objectives 
shown in table 2 to the rows and columns shown in table 3 (which describes the presentation 
frequencies) for each subject.⁵ Subjects were instructed that they would be presented with a series
of errors found during the audits of medium-sized manufacturers by a major accounting firm 


4 The names of the objectives and cycles were the same as those used by the firm providing subjects.

5 The results of this experiment should not be influenced by any frequency knowledge that auditors possessed prior to 
the experiment. Prior research indicates that frequency knowledge is time-tagged in memory (Hintzman and Block 
1973; Hintzman et al. 1973; Reichardt et al. 1973), such that experimental frequencies can be discriminated from
pre-existing frequency knowledge (Butt 1988; Nelson 1993). Also, the random assignment of cycles and objectives to
experimental frequencies should ensure that any influence of pre-existing frequency knowledge is spread across 
treatments. As discussed further in the results section, we gathered data from another group of auditors which confirmed 
that results were not influenced by subjects’ pre-existing frequency knowledge. 


TABLE 2
Financial Statement Errors

                  Transaction Cycle
Audit Objective   Sales                      Inventory/Purchases        Investments

Proper Cutoff     Next period's sales were   Raw materials were         Purchases of treasury
                  included in the current    improperly shown as        bills were recorded in
                  year's revenue and         received after year-end.   the wrong fiscal period.
                  receivables.

Validity          Billings to legitimate     More finished goods were   Fictitious investments
                  customers were booked      recorded as received       were included in the
                  twice.                     than were actually         account balance.
                                             received.

Valuation         The bad debt expense and   Obsolete inventory was     Marketable securities
                  allowance were             not written down to net    were not reduced to
                  underestimated.            realizable value.          lower of cost or market.

during the last quarter of 1992, and that their task was to remember the errors to the best of their
ability. No mention was made of frequency learning, or of the tasks that subjects would perform 
after the error presentations. The subjects were also told that each error would remain on the 
computer screen for ten seconds. The structure and content of this frequency presentation task is 
the same as that used by Bonner et al. (1993), and similar to that used by Butt (1988).⁶
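
The two ordering constraints described above can be illustrated with a short sketch. It is not the authors' procedure: the error labels are placeholders, the cell frequencies are those of table 3, the "first half" is assumed to be the first 24 of the 49 positions, and each of the three highest-frequency errors is assumed to be split evenly between halves. Orderings are found by simple rejection sampling.

```python
import random
from collections import Counter

# Presentation frequencies of the nine errors (cells of table 3); labels are placeholders.
FREQS = {"e1": 1, "e2": 2, "e3": 4, "e4": 2, "e5": 4, "e6": 8, "e7": 4, "e8": 8, "e9": 16}


def no_adjacent_repeats(seq):
    """True if no error is immediately followed by the same error."""
    return all(a != b for a, b in zip(seq, seq[1:]))


def shuffled_without_repeats(items, rng, prev=None, max_tries=100_000):
    """Rejection-sample an ordering with no adjacent repeats that does not start with prev."""
    items = list(items)
    for _ in range(max_tries):
        rng.shuffle(items)
        if items[0] != prev and no_adjacent_repeats(items):
            return list(items)
    raise RuntimeError("no valid ordering found")


def presentation_order(freqs, rng):
    """Build one subject's presentation sequence under both constraints."""
    top3 = sorted(freqs, key=freqs.get, reverse=True)[:3]
    # Assumption: each high-frequency error appears equally often in both halves
    # (this requires even frequencies, which holds for 16, 8, and 8).
    half_top = [e for e in top3 for _ in range(freqs[e] // 2)]
    rest = [e for e, n in freqs.items() if e not in top3 for _ in range(n)]
    rng.shuffle(rest)
    cut = len(rest) // 2
    first = shuffled_without_repeats(half_top + rest[:cut], rng)
    second = shuffled_without_repeats(half_top + rest[cut:], rng, prev=first[-1])
    return first + second


order = presentation_order(FREQS, random.Random(0))
assert Counter(order) == Counter(FREQS)  # all 49 presentations are present
```

Passing a different seed to `random.Random` for each subject would give each subject a different random order, as the design requires.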

After viewing the individual error presentations, subjects answered four questions from the 
ability test used by Bonner and Walker (1994) as a distractor task to clear short-term memory. 
Next, subjects were asked to imagine that they were auditors who were working for the firm from 
which the list of errors was obtained, and to assume that they had been exposed to all of those 
audits, such that the errors constituted their experience. Then the subjects performed the 
conditional probability estimation test.

The conditional probability estimation test required judgments of two conditional probabili- 
ties, with the specific type of conditional depending on the assignment of the subject to either 
P(cycle error | objective) or P(objective error | cycle) at the beginning of the experiment. Subjects 
assigned to the P(cycle error | objective) task estimated conditionals of the form P(high-frequency
cycle | low-frequency objective) and P(low-frequency cycle | high-frequency objective), while
subjects assigned to the P(objective error | cycle) task estimated conditionals of the form
P(high-frequency objective | low-frequency cycle) and P(low-frequency objective | high-frequency
cycle). Only these conditionals were estimated because they differ the most in the conditional
probability implied by the frequencies presented to subjects (1/7 for P(low | high), 4/7 for P(high
| low)), and thus allow detection of an effect without a large number of estimates. The identity of
the high- or low-frequency cycles and objectives referenced in the conditionals depended on the 
random assignment at the beginning of the experiment of the errors in table 2 to the columns and 
rows of table 3. Subjects answered each question by moving a pointer from the "zero" endpoint 
of a sliding scale numbered from zero to 100 to a position that indicated their probability estimate. 
The order of the two questions (P(high | low) and P(low | high)) was randomized between subjects.
An example of a conditional probability estimation test question is shown in appendix B. 
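
The conditional probabilities implied by the presented frequencies follow directly from the cells of table 3. The sketch below works them out as exact fractions; the matrix layout (objectives as rows, cycles as columns, ordered from low to high frequency) and the function names are illustrative assumptions.

```python
from fractions import Fraction

# Rows: audit objectives (low, medium, high frequency).
# Columns: transaction cycles (low, medium, high frequency), as in table 3.
FREQ = [
    [1, 2, 4],
    [2, 4, 8],
    [4, 8, 16],
]
LOW, MEDIUM, HIGH = 0, 1, 2


def p_cycle_given_objective(cycle, objective):
    """P(cycle error | objective): cell frequency over the objective's row total."""
    return Fraction(FREQ[objective][cycle], sum(FREQ[objective]))


def p_objective_given_cycle(objective, cycle):
    """P(objective error | cycle): cell frequency over the cycle's column total."""
    column = [row[cycle] for row in FREQ]
    return Fraction(FREQ[objective][cycle], sum(column))


print(p_cycle_given_objective(HIGH, LOW))   # 4/7
print(p_cycle_given_objective(LOW, HIGH))   # 1/7
print(p_objective_given_cycle(HIGH, LOW))   # 4/7
print(p_objective_given_cycle(LOW, HIGH))   # 1/7
```

Because the frequency matrix is symmetric, the two task versions have identical normative answers, so any difference in subjects' estimates between them cannot come from the presented frequencies themselves.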


6 The differences between the presentations in this experiment and Butt (1988) are minor; e.g., presentations are by
computer rather than slide projector, exposure durations differ slightly. 


TABLE 3
Individual and Category-Level Frequencies of Errors

                     Transaction Cycle        Total
Audit Objective      A       B       C        (Objective Category)
I                    1       2       4         7
II                   2       4       8        14
III                  4       8      16        28

After completing the conditional probability estimation test, subjects viewed an instruction
screen, and then completed the conditional audit decision task. The conditional audit decision task 
asked subjects to assume that they were planning the audit of a medium-sized manufacturing 
company that is similar to the manufacturing companies whose errors they saw previously in the 
study, and required subjects to perform three different allocations of 100 hours of audit effort.⁷
A subject assigned to P(objective error | cycle) (P(cycle error | objective)) for the conditional 
probability estimation test allocated 100 hours of audit effort across the three audit objectives 
(transaction cycles) used in the experiment and an “other” category for each of the three cycles 
(objectives), thus performing three 100 hour allocation tasks. For each allocation task, subjects 
moved a pointer from the “zero” endpoint on a sliding scale with endpoints of zero and 100 to 
indicate the hours of effort they would budget for each category, and were not allowed to proceed 
unless they had allocated all 100 hours. The order of the three allocation tasks and the order of 
categories within each task were randomized between subjects, with the constraint that the 
“other” category always appeared fourth. An example of a conditional audit decision task 
question is shown in appendix C. 

After performing the conditional probability estimation test and the conditional audit 
decision task, subjects answered six unconditional frequency estimation questions (one for each 
of three cycles and three objectives) and performed two unconditional audit decision tasks (one 
allocating across cycles, the other across objectives). The unconditional questions elicited 
marginal category frequencies (7, 14 or 28, see table 3). They were the same frequency tests and 
audit decision tasks used in Bonner et al. (1993). The unconditional frequency estimation test 
provided data to determine that subjects had attended to the experimental frequencies. Subjects 
answered each question by moving a pointer from the “one” endpoint of a sliding scale numbered 
from one to 34 to a position that indicated their frequency estimate. These endpoints were chosen 
to range six below and six above the minimum and maximum actual category frequencies to 
ensure that answers were not obvious. The order of the two unconditional frequency estimation
tests (cycle or objective) and the order of the three cycle or objective questions within each test 
were randomized between subjects.⁸ The unconditional audit decision task provided exploratory
data unrelated to this study. Except with respect to determining whether subjects learned 
experimental frequencies, the results from the unconditional tasks will not be discussed further 
in this study. 

Following the unconditional frequency test and unconditional audit decision task, subjects 
performed two additional sort tasks. These sort tasks were identical to the free sort task at the 
beginning of the study, except one task directed subjects to sort according to transaction cycle, 
and the other according to audit objective. Data from these "directed sort" tasks were tested to 
ensure that differences between the "type of category given" treatments could not be attributed 
to differences in subjects' knowledge of the transaction cycles and audit objectives related to each
error. The experiment concluded with subjects answering some demographic questions about 
age, years of experience in public accounting and auditing, title, client industry emphasis, and 
whether English was a first or second language. 


7 A higher probability of error might influence not only budgeted hours of audit effort, but also the timing, extent and level 
of staff assigned to audit tests. To the extent that the audit hour allocation task does not encompass these audit decisions, 
results are biased against supporting H2. 

8 Recall that whether or not the unconditional frequency estimation test preceded the unconditional audit decision task
was a between-subjects variable unrelated to this study. 


Nelson, Libby, and Bonner—Knowledge Structure and the Estimation 39 


IV. RESULTS 


Verification of Objective as Dominant Dimension in Knowledge Structure 


To determine whether our subjects possessed the objective-dominant knowledge structure 
identified by Frederick et al. (1994), subjects’ free sorts were compared to the free sort that would 
have been made had subjects sorted according to objective or cycle. To conduct these comparisons,
each subject's set of nine free sort responses was converted into a nine-by-nine similarity
matrix containing "ones" if errors were grouped together and "zeros" if errors were not grouped 
together. Nine-by-nine similarity matrices were also formed for the predetermined transaction 
cycle and audit objective categorizations presented in table 2. The fit of each subject's similarity 
matrix to a cycle and an objective categorization was measured by correlating the vector formed 
from the lower left triangle of the subject's similarity matrix with the vector formed from the lower
left triangle of the predetermined cycle and objective similarity matrices. This measure of the 
correspondence between two similarity matrices is commonly called a "cophenetic correlation" 
(Sneath and Sokal 1973). 
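
As an illustration of this measure, the sketch below converts a sort into the 36-element lower-triangle vector of its similarity matrix and correlates two such vectors. The indexing of the nine errors and the example subject sort are hypothetical; only the computation follows the description above.

```python
from itertools import combinations
from math import sqrt


def pair_vector(groups):
    """Lower-triangle vector of the similarity matrix: 1 if two errors share a group, else 0."""
    return [1 if groups[i] == groups[j] else 0
            for i, j in combinations(range(len(groups)), 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (undefined if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cophenetic_correlation(sort_a, sort_b):
    """Correlation between the lower triangles of two sorts' similarity matrices."""
    return pearson(pair_vector(sort_a), pair_vector(sort_b))


# Hypothetical indexing of the nine errors: objective varies slowest, cycle fastest.
OBJECTIVE_SORT = [0, 0, 0, 1, 1, 1, 2, 2, 2]
CYCLE_SORT = [0, 1, 2, 0, 1, 2, 0, 1, 2]

# A made-up subject who sorts by objective but misplaces the ninth error.
subject_sort = [1, 1, 1, 2, 2, 2, 3, 3, 1]

print(round(cophenetic_correlation(subject_sort, OBJECTIVE_SORT), 2))  # 0.64
print(round(cophenetic_correlation(subject_sort, CYCLE_SORT), 2))      # -0.21
```

Group labels do not matter, only which errors are placed together. Because cycles and objectives are fully crossed, a pure objective sort correlates -1/3 with a pure cycle sort, so the two fit measures are not independent.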

The average cophenetic correlation between subjects’ free sort similarity matrices and the 
predetermined objective (cycle) similarity matrix is .53 (-.04). The average correlation with 
objective is significantly higher than the average correlation with cycle (t=4.22, two-tailed 
p=.0001), indicating that the audit objective dimension was more dominant than the transaction 
cycle dimension on average in our auditors’ free sorts.⁹


Test of H1 


H1 predicts that subjects’ conditional probability judgments will be influenced more by 
experimental frequencies for P(cycle error | objective) than for P(objective error | cycle). In other
words, with an objective-dominant knowledge structure, subjects should be better able to 
discriminate event frequencies when a probability judgment is conditioned on objective than 
when it is conditioned on cycle. This implies that the difference between P(high frequency cycle 
error | low frequency objective) and P(low frequency cycle error | high frequency objective) 
should be greater than the difference between P(high frequency objective error | low frequency 
cycle) and P(low frequency objective error | high frequency cycle). Recall that each subject 
completed two conditional probability judgments (P(high | low), P(low | high)) with only one of
the two categories (cycle or objective) as "given" (i.e., judgments were either of the form
P(objective error | cycle) or P(cycle error | objective)). Thus, the influence of knowledge structure
on estimation can be tested in a 2 x 2 ANOVA, with type of conditional probability estimate
(P(high | low), P(low | high)) a within-subjects variable and given category type ((. | objective),
(. | cycle)) a between-subjects variable. An interaction between these two variables of the form
proposed above would support H1. Specifically, the influence of type of conditional probability 
estimate should be greater when the given category type is "objective." 

The mean conditional probability estimates are shown in table 4. The interaction between 
type of conditional probability estimate and given category type is significant (t=1.78; one-tailed 
p=.04), supporting H1. The auditors’ conditional probability estimates were influenced by
frequency information when the conditional was of the form P(cycle | objective) (t=2.10; one- 
tailed p=.02), but not when it was of the form P(objective | cycle) (t=-0.33; one-tailed p=.63). 
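
The size of this interaction can be read from the cell means reported in table 4. The sketch below computes only the point estimates; it does not reproduce the reported t-tests, which require subject-level data. The dictionary layout and names are illustrative.

```python
# Mean estimates (in percent) from table 4, keyed by (given category, conditional type).
MEANS = {
    ("cycle", "P(low | high)"): 48.6,
    ("cycle", "P(high | low)"): 46.3,
    ("objective", "P(low | high)"): 40.6,
    ("objective", "P(high | low)"): 56.8,
}


def frequency_effect(given):
    """Mean P(high | low) estimate minus mean P(low | high) estimate for one given category."""
    return MEANS[(given, "P(high | low)")] - MEANS[(given, "P(low | high)")]


effect_objective = frequency_effect("objective")   # about 16.2 percentage points
effect_cycle = frequency_effect("cycle")           # about -2.3 percentage points
interaction = effect_objective - effect_cycle      # about 18.5 percentage points

# Difference implied by the presented frequencies: 4/7 versus 1/7.
normative_effect = 100 * (4 / 7 - 1 / 7)           # about 42.9 percentage points

print(round(effect_objective, 1), round(effect_cycle, 1), round(interaction, 1))
```

Even when conditioned on objective, the mean difference is well below the roughly 43-point difference implied by the presented frequencies, so the estimates reflect the frequencies only partially.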


9 Thirty-six of the 47 subjects’ sorts (77%) were more like the predetermined objective sort than the predetermined cycle
sort, and no results changed when only these subjects were included in analyses. The proportion of subjects whose sorts 
favored the dominant dimension is similar to results found in prior psychology research (Gavanski and Hui 1992). 
TABLE 4
Conditional Probability Estimates

Analysis of Conditional Type¹ x Presented Conditional Probability Contrast

Means (Standard Deviations) [Contrast Weights]

Presented Conditional           Estimates of            Estimates of
Probability                     P(Objective | Cycle)    P(Cycle | Objective)
P(low | high) = 4/28 = 14.3%    48.6 (30.7) [-1]        40.6 (22.4) [+1]
P(high | low) = 4/7 = 57.1%     46.3 (31.0) [+1]        56.8 (26.0) [-1]

¹ "Conditional Type" is either P(cycle | objective) or P(objective | cycle).


Test of H2 


H2 predicts that subjects' audit effort allocations will be influenced more by experimental 
frequencies when subjects allocate effort across transaction cycles for each audit objective than 
when subjects allocate effort across audit objectives for each transaction cycle. Recall that, in the 
effort allocation task, subjects who had estimated conditional probabilities with cycle as "given" 
allocated effort across objectives for each cycle, while subjects who had estimated conditional 
probabilities with objective as “given” allocated effort across cycles for each objective. For 
example, a subject with cycle as “given” made a total of 12 responses in the effort allocation task, 
consisting of P(low frequency objective error | cycle), P(medium frequency objective error |
cycle), P(high frequency objective error | cycle), and P(other objective errors | cycle) for each of
the low, medium, and high frequency cycles that appear as given in the conditional. Dropping
allocations to P("other" | .), each subject's remaining nine responses can be classified in a 3 x 3
design, consisting of "given category frequency" ((. | low), (. | medium), and (. | high)), and "event
category frequency" ((low | .), (medium | .), and (high | .)). When coupled with the between-subjects
manipulation of the type of the category given ((. | objective), (. | cycle)), the resulting model is
a 2 x 3 x 3 ANOVA. H2 would be supported by a significant given category type x linear trend¹⁰
in event category frequency contrast. That is, event category frequencies should influence audit 
effort allocations more when the given category type is "objective." Note that, if subjects are 
estimating the requested conditional probability, given category frequency should not have an 
effect.¹¹
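
The claim that given category frequency should have no effect can be checked against table 3: within each given category, the event categories stand in the same 1:2:4 ratio. A minimal sketch, assuming each row of the matrix is one given category and ignoring the "other" allocation:

```python
from fractions import Fraction

# Each row is one given category (low, medium, high frequency); the entries are the
# presented frequencies of the low, medium, and high frequency event categories.
FREQ = [
    [1, 2, 4],
    [2, 4, 8],
    [4, 8, 16],
]


def event_shares(given_row):
    """Share of a given category's presentations falling in each event category."""
    total = sum(given_row)
    return [Fraction(cell, total) for cell in given_row]


shares = [event_shares(row) for row in FREQ]
print(shares[0])  # [Fraction(1, 7), Fraction(2, 7), Fraction(4, 7)]
```

All three rows yield the same shares, so a subject who allocated hours in proportion to the presented conditional frequencies would make the same allocation regardless of the given category's frequency.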

The mean audit effort allocations are shown in table 5. The given category type x linear trend 
in event category frequency contrast is significant (t=2.24; one-tailed p=.01).¹² The linear trend


10 See Rosenthal and Rosnow (1988) for a discussion of the use of linear trends and focused contrasts in analysis of 
variance. 

11 Referring to table 3, regardless of whether the given category frequency is low (seven presentations), medium (14
presentations) or high (28 presentations), the relative event category frequency is 1:2:4. Therefore, the proportion of 
100 hours of audit effort allocated to a particular event category should not depend on given category frequency. 

12 A conventional test of the interaction between event category frequency and given category type yields a p-value of
.0322 (F=3.57).
was significant when the given category was objective (t=2.53; one-tailed p=.008), but not when
the given category was cycle (t=-0.55; one-tailed p=.71), supporting H2. The auditors’ audit effort 
allocations were influenced by frequency information when distributing effort across cycles for 
each objective, but not when distributing effort across objectives for each cycle. 
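
The linear trend contrasts can be illustrated with the cell means reported in table 5. The sketch applies weights of -1, 0, and +1 to the low, medium, and high presented frequencies within each condition; it yields point estimates only, since the reported t-statistics depend on subject-level variances. Names are illustrative.

```python
# Linear trend weights for presented event category frequencies of 7, 14, and 28.
WEIGHTS = {7: -1, 14: 0, 28: +1}

# Mean hours allocated (table 5), by given category type and presented frequency.
MEAN_HOURS = {
    "cycle": {7: 33.3, 14: 27.5, 28: 31.0},
    "objective": {7: 25.3, 14: 31.5, 28: 37.4},
}


def linear_trend(means):
    """Weighted sum of cell means using the linear trend weights."""
    return sum(WEIGHTS[freq] * mean for freq, mean in means.items())


trend_objective = linear_trend(MEAN_HOURS["objective"])  # about 12.1 hours
trend_cycle = linear_trend(MEAN_HOURS["cycle"])          # about -2.3 hours
difference = trend_objective - trend_cycle               # about 14.4 hours

print(round(trend_objective, 1), round(trend_cycle, 1), round(difference, 1))
```

Allocations rise with presented frequency only when effort is distributed across cycles for a given objective, which is the pattern H2 predicts.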


Ruling Out Alternative Explanations 


Knowledge of Both Cycle and Objective 

One explanation for a difference in results between the "type of category given" treatments 
is that subjects might differ in their knowledge of the cycle and objective dimensions. To test for 
this possibility, the directed sorts that subjects performed at the end of the experiment were 
compared to the sorts that would have been made had subjects sorted according to objective and 
cycle. The mean cophenetic correlation between the directed objective (cycle) sort and the 
predetermined objective (cycle) sort was .71 (.74). These two means are not significantly different
(t=0.38; two-tailed p=.71), indicating that our subjects’ knowledge of objective and cycle was 
roughly equivalent. This result indicates that support for H1 and H2 cannot be attributed to 
differential understanding of membership of errors in objective and cycle categories. 


Knowledge of Experimental Frequencies 

Another explanation for a difference in results between the “type of category given” 
treatments is that subjects might differ in the degree to which they acquired knowledge of the 
experimentally-manipulated frequencies of error in the cycle and objective categories, and then 
simply answered the conditional probability questions using their estimates of unconditional 
cycle or objective frequencies. To test this possibility, the degree to which the experimental 
frequencies are reflected in each subject’s estimates of unconditional cycle and objective 
frequencies was evaluated and compared. The mean unconditional frequency estimates are
shown in table 6. The linear trend in presented frequencies is significant (t=5.15; one-tailed 
p<.0001), indicating that subjects acquired unconditional frequencies from their experience in the 
experiment. The linear trend in presented frequencies x category-type (cycle or objective) 


TABLE 5
Conditional Audit Effort Allocations

Analysis of Conditional Type¹ x Linear Trend in Presented Event Category Frequencies² Contrast

Means (Standard Deviations) [Contrast Weights]

Presented     Conditioned           Conditioned
Frequency     on Cycle              on Objective
 7            33.3 (15.5) [-1]      25.3 (15.1) [+1]
14            27.5 (14.4) [0]       31.5 (13.2) [0]
28            31.0 (16.1) [+1]      37.4 (17.3) [-1]

¹ "Conditional Type" is either P(cycle | objective) or P(objective | cycle).

² "Presented Event Category Frequencies" are the frequencies of the events that are conditioned on either objective or
cycle categories. For example, assume that the Sales and Receivable cycle had a frequency of 28. In the conditional
probability P(Sales Error | Cutoff), Sales is the event being conditioned on Cutoff, so "presented event category
frequency" has a value of 28.

contrast is not significant (t=0.00; two-tailed p=0.95), indicating that the degree to which
subjects’ frequency estimates reflected experimental frequencies did not differ significantly 
between the cycle and objective categories. These results indicate that support for H1 and H2 
cannot be attributed to differential learning of unconditional frequencies. Also, because subjects 
estimated unconditional cycle and objective frequencies equally well, but differed in the degree 
to which conditional probabilities reflected frequencies depending on whether the probability 
was conditioned on cycle or objective, we can conclude that subjects were not merely answering 
the conditional probability questions with their estimates of unconditional probabilities. 
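
The same linear trend logic can be applied to the mean unconditional estimates reported in table 6. The sketch below is a point-estimate illustration only, with hypothetical names; it shows why the category-type contrast is essentially zero.

```python
# Mean unconditional frequency estimates (table 6), by category type and presented frequency.
MEAN_ESTIMATES = {
    "objective": {7: 12.5, 14: 12.8, 28: 18.4},
    "cycle": {7: 13.0, 14: 13.8, 28: 18.9},
}


def linear_trend(means):
    """High-frequency mean minus low-frequency mean (weights -1, 0, +1)."""
    return means[28] - means[7]


trend_objective = linear_trend(MEAN_ESTIMATES["objective"])  # about 5.9
trend_cycle = linear_trend(MEAN_ESTIMATES["cycle"])          # about 5.9
gap = abs(trend_objective - trend_cycle)                     # about 0.0

presented_trend = 28 - 7  # the same contrast applied to the presented frequencies

print(round(trend_objective, 1), round(trend_cycle, 1), round(gap, 1), presented_trend)
```

Both trends are about 5.9, so the mean estimates track presented frequencies equally for cycles and objectives, although they are compressed relative to the presented difference of 21.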


Knowledge of Non-Experimental Frequencies 

A final possible explanation for the results is that pre-existing frequency knowledge
interfered with the experimental frequencies in some systematic way. Prior research, the design 
of our experiment, and further analyses all indicate that this is not a plausible explanation. First, 
as mentioned previously, prior research indicates that frequency knowledge is time-tagged in 
memory (Hintzman and Block 1973; Hintzman et al. 1973; Reichardt et al. 1973), such that 
experimental frequencies can be discriminated from pre-existing frequency knowledge (Butt 
1988; Nelson 1993, 1994). Second, recall that an element of our design was that the frequencies 
of the cycle and objective categories (and thus of the errors) were assigned randomly between 
subjects. Examination of the assignments that occurred indicated that this random assignment 
was successful—each category appeared as the low, medium, and high frequency category 
approximately the same number of times. Therefore, any influence of pre-existing frequency 
knowledge that did occur should have been distributed across treatments, and thus would only 
have decreased the power of our tests. Third, to ensure that pre-existing frequency knowledge did
not bias results, we asked a separate group of 46 auditors who were enrolled in the same staff 
training session to rate the relative frequencies of the three cycle and the three objective categories 
as part of another study. The auditors were not exposed to the experimental frequencies, so these 
ratings constituted their estimates of real-world frequencies. Results indicated that valuation 
errors are considered to occur relatively more frequently than cutoff and validity errors, and that 


13 Whether the unconditional frequency estimation task was completed before or after the unconditional audit decision task
did not influence the effect of actual frequency on unconditional frequency estimate (t=0.79; two-tailed p=.43).


TABLE 6
Unconditional Frequency Estimates

Analysis of Category (Cycle v. Objective) x Linear Trend in Presented Frequencies Contrast

Means (Standard Deviations) [Contrast Weights]

              Unconditional         Unconditional
Presented     Objective             Cycle
Frequency     Estimates             Estimates
 7            12.5 (7.1) [-1]       13.0 (7.1) [+1]
14            12.8 (5.9) [0]        13.8 (6.9) [0]
28            18.4 (7.7) [+1]       18.9 (9.0) [-1]

investments errors are considered to occur relatively less frequently than sales and inventory
errors. Therefore, hypothesis H1 was re-tested after omitting those subjects whose conditional 
probability estimates involved either the valuation objective or the investments cycle. Results 
were unchanged. This result indicates that support for the hypotheses is not driven by interference 
from pre-existing knowledge of error frequencies. 


V. DISCUSSION


As noted previously, a growing literature in accounting examines the influence of knowledge 
structure on important cognitive processes like recall and frequency estimation. For the most part 
this literature concludes that structure enhances audit judgment. For example, in their recent 
examination of 25 studies of heuristics and biases in auditing, Smith and Kida (1991) document 
that biases often are mitigated when knowledgeable subjects perform familiar tasks. This 
possibility was first raised by Joyce and Biddle (1981a,b). However, Smith and Kida further point 
out that Frederick and Libby's (1986) results using a highly abstract task suggest that, depending 
on the relationship between the auditor's knowledge structure and the task, knowledge could 
either help or hinder auditors' judgments. The current paper demonstrates a case where the 
auditor's knowledge structure, as developed through experience, actually hinders the auditor's 
ability to apply knowledge acquired through experience to an important audit judgment. While 
the estimation of conditional probabilities (required to test our hypothesis 1) is an abstract task, 
the allocation of audit effort to audit objectives within transaction cycles (used in testing 
hypothesis 2) is a relatively familiar, intuitive task to most experienced auditors. 

Several specific results merit further discussion. The free sort results replicate Frederick et 
al. (1994) by demonstrating that audit objective dominates transaction cycle in experienced 
auditors' free sorts of audit errors. Although this structure might be quite useful for a variety of 
audit tasks, the results of tests of H1 and H2 suggest that it hinders auditors' ability to draw on 
previously experienced frequencies when estimating conditional probabilities of the form 
P(objective error | cycle) and when allocating audit hours to various objectives within transaction 
cycles. Other tests indicate that these results cannot be attributed to interference from pre-existing
frequency knowledge or to differences in understanding or frequency learning between cycle and 
objective categories. 

These results extend the results of prior psychology research in three ways. First, we 
demonstrate that the influence of knowledge structure occurs with respect to categories and 
stimuli whose features are less saliently defined than those used in prior research (e.g., Gavanski
and Hui 1992; Sherman et al. 1992), supporting the generality of the effects identified in prior 
research. Second, we demonstrate that the influence of knowledge structure occurs for experi- 
enced professionals making intuitive, familiar judgments, indicating that prior results are not due 
to semantic confusion in answering conditional probability questions and suggesting that these 
effects are not averted through adaptation in meaningful contexts. Third, we demonstrate that the 
influence of knowledge structure on conditional probability judgments is powerful enough to be 
observed in decisions, even when the decisions are structured to provide category names for use 
as retrieval cues. 

From an audit practice perspective, these results indicate that the current structure of audit 
planning judgments may not always be conducive to auditors utilizing their knowledge of the 
frequencies of previously experienced errors. This does not necessarily imply that the audit 


14 Because H2 required subjects to allocate hours of audit effort across either all cycles or all objectives, it could not be
tested for a subset of cycles or objectives. 
planning task should be restructured to better match auditors’ knowledge organization. It is
unlikely that the benefits of this restructuring would exceed the costs, given that the current cycle- 
dominant structure facilitates the identification of inherent and control risks that vary from cycle 
to cycle. However, auditors' knowledge organizations could be changed to better match the audit
planning task. For example, training approaches could be used to render cycle equally or more 
dominant than objective in the knowledge structures of experienced auditors and thus better 
match knowledge structure to audit structure. Likewise, college instruction or early staff training 
could be modified to communicate better to novice auditors a cycle-dominant organization that 
matches the structure of the audit decisions to which their future professional experience will 
eventually be applied. Finally, decision aids might be constructed that provide auditors with 
retrieval cues for individual errors and combination rules for formation of appropriate conditional 
probabilities. 


APPENDIX A 
Sort Task Screen

1  2  3   Next period's sales were included in the current year's revenue and receivables.
1  2  3   More finished goods were recorded than were actually received.
1  2  3   Marketable securities were not reduced to lower of cost or market.
1  2  3   Fictitious investments were included in the account balance.
1  2  3   The bad debt expense and allowance were underestimated.
1  2  3   Raw materials were improperly shown as received after year end.
1  2  3   Obsolete inventory was not written down to net realizable value.
1  2  3   Purchases of treasury bills were recorded in the wrong fiscal period.
1  2  3   Billings to legitimate customers were booked twice.

Classify the errors into 3 groups. Each group should contain the errors that you
believe go together. After you are finished, click on "done" to continue.

APPENDIX B
Conditional Probability Estimation Screen

Please base your answer to the following question on the error
frequencies to which you were exposed earlier in the study.

Given that you are dealing with an error that occurred in the
TRADE RECEIVABLES, SALES AND COLLECTIONS CYCLE, what is
the probability that the error violated the VALIDITY
OBJECTIVE?

Please respond by moving the slider to the correct probability
on the percentage scale below.

[Slider: percentage scale from 0 to 100]





APPENDIX C 
Conditional Audit Decision Screen 


Assume you have a hundred hours to allocate to finding errors that occur 
in the TRADE RECEIVABLES, SALES AND COLLECTIONS cycle. Please
allocate the hours to the following categories of audit objective: 





VALUATION _ 
The Effects of Time Pressure and
Knowledge on Key Word Selection 
Behavior in Tax Research 


Brian C. Spilker 
Brigham Young University 


ABSTRACT: This paper considers how time pressure and knowledge separately 
and jointly affect tax researchers' ability to locate relevant authority. Tax professionals
and graduate tax students participated in a computer interactive experiment in
which subjects selected relevant key words relating to a partnership tax issue. The
results indicate that declarative and procedural knowledge enhance tax researchers' 
ability to select relevant key words in a time-restricted task. The most significant 
finding is that subjects with procedural knowledge responded more positively to time 
pressure than did subjects without such knowledge, thereby demonstrating an 
interaction between time pressure and knowledge. 


Key Words: Time pressure, Declarative knowledge, Procedural knowledge, Infor- 
mation search. 


Data Availability: The data used in this study are available upon request.


I. INTRODUCTION 


Tax professionals seek relevant authority within tax databases (e.g., Internal Revenue
Code, Treasury Regulations, revenue rulings, decided court cases, etc.) to resolve issues.
Information services such as Commerce Clearing House's Standard Federal Tax Reporter
(CCH) and The Research Institute of America's United States Federal Tax Reporter (RIA) 
provide a common starting point for tax information search. Tax researchers usually access these 
services through "key word" indexes by selecting key words that will likely refer them to relevant 
authority (i.e., relevant key words). In fact, Gardner and Stewart (1993, 125) indicate that “the key
to utilizing each tax service effectively lies in the mastery of its index systems.” The greater the 
researcher’s ability to select a diverse set of relevant key words for a given issue, the more likely 
the researcher is to find relevant authority and avoid erroneous judgments and missed 
opportunities.¹

The purpose of this study is to examine the separate and joint effects of time pressure and 
knowledge, two important variables in the tax context, on tax researchers’ ability to select relevant 
key words. Understanding the effects of time pressure on tax researcher performance is important 
because, through time budgets, tax managers generally have control over the amount of available 
search time. Understanding how different types of knowledge lead to performance differences 
has potential implications for practice management and for the development of professional 
training programs designed to improve tax researcher performance. 

Fifty-one full-time graduate tax students at the beginning of their degree program, 53 full- 
time graduate tax students at the end of their degree program, and 43 tax professionals from a Big- 
6 accounting firm participated in a computer interactive experiment in which subjects selected 
relevant key words relating to a partnership tax issue. Subjects also completed a control case 
involving an international tax issue to help rule out alternative explanations for the results. The 
findings indicate that (1) subjects with significant declarative partnership tax knowledge selected 
a greater number of relevant key words than subjects with little or no declarative knowledge, (2) 
subjects with significant procedural partnership tax research knowledge and significant declara- 
tive partnership tax knowledge selected a greater number of relevant key words than subjects with 
significant declarative knowledge only, and (3) time pressure had a more positive effect on 
subjects with significant procedural partnership tax research knowledge than on subjects lacking 
such knowledge. 

While researchers have speculated about the nature of the interaction between time pressure 
and knowledge, this study is the first to empirically consider how the effects of time pressure 
depend on knowledge. The results are consistent with the speculation in the literature (i.e., Brown
and Solomon 1992; Gibbins 1984) that more knowledgeable accountants have greater ability to 
adapt to time pressure than less knowledgeable accountants. Also, while much of the accounting 
time pressure literature focuses on negative effects of time pressure (e.g., Brown and Solomon 
1992; Choo and Firth 1993; McDaniel 1990), this article considers potential positive aspects. 
These aspects are important because firms use time budgets to enhance the productivity of tax 
researchers rather than to detract from it. Finally, this study contributes to the literature by 
providing experimental evidence supporting the work of researchers who describe the process by 
which accountants develop expertise (e.g., Bédard and Chi 1993; Gibbins 1988; Waller and Felix 
1984). 

The remainder of this article is organized as follows. The next section provides the relevant 
theories and develops the hypotheses. Section 3 describes the experimental method. Section 4 
presents the results of the experiment. The final section summarizes the results and offers 
concluding comments. 


II. THEORY AND DEVELOPMENT OF HYPOTHESES
Time Pressure 


Time pressure is a specific form of stress that has been studied in various decision-making 
contexts, including auditing (e.g., Brown and Solomon 1992; McDaniel 1990), marketing (e.g., 


1 Gardner and Stewart (1993, 125-126) provide an example of this idea involving the CCH and the RIA indexes.
Heroux et al. 1988), and business management (e.g., Bronner 1982). Decision maker performance
under stress (i.e., time pressure) is theorized to follow an inverted-U-shaped function (e.g., 
Anderson 1976; Bronner 1982; Easterbrook 1959). As time pressure increases from low to 
moderate levels, performance improves as decision makers narrow their focus to exclude 
peripheral or task-irrelevant cues while still considering the central or task-relevant cues 
(Easterbrook 1959). In addition, decision makers increase processing speed, which enables them 
to consider more decision-relevant information within the allotted time. As time pressure 
increases from moderate to high (or extreme) levels, performance declines as decision makers 
narrow their focus to the extent that even task-relevant cues are excluded from consideration. 
Moreover, decision makers continue to increase processing speed such that their ability to fully 
process and make use of available relevant information is inhibited by limitations in processing 
capacity (Anderson 1976; Ben Zur and Breznitz 1981; McDaniel 1990). 

Applying these ideas to key word selection behavior in tax research suggests that as time 
pressure increases from low to moderate levels, tax researchers should direct their focus toward 
finding key words that they consider most likely to refer them to relevant authority and increase 
processing speed in order to consider more potentially relevant key words in the limited time 
period. These responses to time pressure should enhance the extent to which tax researchers select 
relevant key words in a time-restricted task, and thus increase their proficiency at finding relevant 
authority in restricted time periods. To explore the effects of time pressure in a key word selection 
task, the following hypothesis will be tested: 


H1: When the subjects’ objective is to select all the relevant key words in the index, subjects in 
the moderate time pressure condition will select a greater number of relevant key words than 
subjects in the low time pressure condition. 


Knowledge 


A considerable amount of research has been conducted investigating the effects of knowl- 
edge on performance in a number of auditing-related tasks (e.g., Bonner 1990; Bonner and Lewis 
1990; Frederick 1991; Frederick and Libby 1986; Tubbs 1992). In contrast, research investigating
the role of knowledge in explaining performance in tax-related tasks has been limited to a small 
number of studies (e.g., Bonner et al. 1992; Marchant et al. 1991). A primary motivation cited for 
studying knowledge effects is that doing so increases our understanding of the nature of 
accounting expertise (Bédard and Chi 1993; Libby 1993). 

Both declarative knowledge and procedural knowledge have been hypothesized to be 
important for performance in professional tasks such as tax research (Davis and Solomon 1989; 
Gibbins 1988; Waller and Felix 1984). Declarative knowledge is knowledge of facts and concepts 
in a domain (Anderson 1990; Waller and Felix 1984), while procedural knowledge is knowledge 
about how to perform a task (Anderson 1987; Bonner and Pennington 1991; Waller and Felix 
1984). The basic units of procedural knowledge are if-then condition-action rules which specify 
that if a particular condition occurs, then a particular action takes place (Anderson 1982, 1987).²
Whereas declarative knowledge is generally believed to be acquired through instruction, 
procedural knowledge is generally believed to be developed through experience at the task. 
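
The if-then structure of a condition-action rule can be made concrete with a minimal sketch. The `Rule` class, the `first_matching_action` helper, and the state dictionary are hypothetical names introduced only for illustration; they are not drawn from Anderson's work or from the experimental software. The sketch assumes that a rule fires when its condition matches the current task state and that, when no rule matches, the searcher falls back on a default such as trial and error.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One if-then production: a condition test and the action it triggers."""
    condition: Callable[[Dict[str, str]], bool]
    action: str


def first_matching_action(rules: List[Rule], state: Dict[str, str], default: str) -> str:
    """Return the action of the first rule whose condition holds, else the default."""
    for rule in rules:
        if rule.condition(state):
            return rule.action
    return default


# Hypothetical rule base for a researcher with procedural knowledge.
rules = [
    Rule(
        condition=lambda s: s.get("issue") == "partnership liabilities and basis",
        action="look under main heading 'Partners and Partnerships'",
    )
]

# A matching condition triggers the stored action; otherwise the default applies.
known = first_matching_action(rules, {"issue": "partnership liabilities and basis"}, "trial and error")
unknown = first_matching_action(rules, {"issue": "foreign tax credit"}, "trial and error")
```

An empty rule list always yields the default, which corresponds loosely to a researcher who has no stored procedure for the task.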

Research in accounting provides evidence that declarative knowledge and procedural 
knowledge are associated with improved performance. For example, Bonner et al. (1992) found 
that a measure of declarative corporate tax knowledge correlated positively with the number of 





2 See Anderson et al. (1991) for a thorough discussion of the nature of condition-action rules in accounting contexts.
tax issues identified and that measures of declarative and procedural corporate tax knowledge
correlated positively with a combined measure of the quantity and quality of issues identified. In 
addition, Bonner and Walker (1994) found that a measure of procedural knowledge related 
positively to performance in a ratio analysis task. Likewise, research in other domains suggests 
that both types of knowledge affect performance positively (Anderson 1990). 

The declarative knowledge relevant to performing tax research in a particular domain 
consists of knowledge of both the tax rules applicable to that domain and the technical terms used 
to describe such rules. In terms of key word selection behavior, this knowledge is critical for 
performance because it allows the tax researcher to distinguish between relevant and irrelevant 
key words. For example, when researching a tax issue relating to how the liabilities of a 
partnership affect the tax basis of a partner’s partnership interest (the research issue used in this 
study), declarative knowledge critical for performance consists of knowledge of (1) the fact that 
in computing the basis, the liabilities of the partnership are allocated to the partners, (2) the 
definition of non recourse debt, (3) the definition of recourse debt, and (4) the fact that the method 
of allocation for non recourse debt is different from the method of allocation for recourse debt. 
Without this declarative knowledge, researchers are unlikely to recognize relevant key words in 
the index that refer to these concepts, while researchers with this declarative knowledge should 
recognize these key words. The more detailed the relevant declarative knowledge, the greater 
number of relevant key words the researcher will be able to recognize and select. To investigate 
the effect of acquiring declarative knowledge in related tax matters on the ability of tax 
researchers to find relevant authority, the following hypothesis will be tested: 


H2A: When the subjects’ objective is to select from the index all the relevant key words relating 
to a partnership tax issue, subjects with significant declarative partnership tax knowledge 
will select a greater number of relevant key words than subjects with little or no declarative 
partnership tax knowledge. 


The performance of tax researchers with equivalent declarative knowledge in a domain could 
differ due to procedural knowledge at performing tax research in the domain. Procedural 
knowledge should be important to performance in tax research tasks, because it affects how the 
tax researcher looks for relevant authority. In terms of key word selection behavior, procedural 
knowledge consists of the rules that guide the tax researcher's search for relevant key words. For 
example, a tax researcher's procedural knowledge base could include the following condition-
action rule: If the objective is to select from the index relevant key words relating to how the
liabilities of a partnership affect the tax basis of a partner's partnership interest, then during the 
search look under the main heading “Non Recourse Liabilities” and/or the main heading “Partners 
and Partnerships.” In contrast, a tax researcher lacking procedural knowledge is likely to respond 
to the same conditioning information with a trial-and-error approach to locating relevant key
words (Anderson 1987). Because procedural knowledge guides the tax researcher to areas of the 
index most likely to contain relevant key words, researchers with this knowledge should be able 
to select a greater number of relevant key words in a time-restricted task than tax researchers 
lacking such knowledge. To investigate the effect of procedural knowledge for performing tax 
research in a domain on the ability of tax researchers to find relevant authority, the following 
hypothesis will be tested: 


H2B: When the subjects’ objective is to select from the index all the relevant key words relating 
to a partnership tax issue, subjects with significant procedural and declarative partnership 
tax knowledge will select a greater number of relevant key words than subjects with 
significant declarative partnership tax knowledge only. 


Time Pressure and Knowledge Interaction


The nature of procedural knowledge is such that it should facilitate successful adaptation to 
time pressure in tax research tasks. When choosing a search procedure, tax researchers are likely 
to place importance on the potential for a particular procedure to accomplish the search objective 
(given the demands of the task) and the cognitive costs involved in implementing that procedure 
(Beach and Mitchell 1978; Payne 1982; Payne et al. 1988, 1990; Russo and Dosher 1983). A tax 
researcher with significant procedural knowledge should have access to strategies that allow 
adaptation to time pressure to a greater extent than tax researchers without such knowledge. For 
example, a tax researcher with relevant procedural knowledge may adapt to time pressure by 
implementing the following condition-action rule: If the objective is to search for relevant key
words and there is significant time pressure, then early in the search look in the areas of the
database (main headings) that are likely to contain the greatest number of relevant key words. In
contrast, a tax researcher lacking procedural knowledge is less likely to have an understanding 
of which areas in the database are potentially most important for selecting key words. Conse- 
quently, without procedural knowledge, the tax researcher will likely be limited to a trial-and- 
error approach to selecting relevant key words, regardless of the level of time pressure (Anderson 
1987). 

In non-time-pressured situations, tax researchers with significant procedural knowledge are 
less likely to organize their search as they do in time-pressured situations for at least two reasons: 
(1) because in non-time-pressured situations there is likely to be a perception that time is available 
to perform a more comprehensive search, the task demands do not place a premium on search 
efficiency, and (2) adaptations to time pressure are likely to require greater cognitive effort 
(Bronner 1982; Easterbrook 1959; Spilker and Prawitt 1994). To investigate the effect of 
procedural knowledge on the ability of researchers to find relevant authority as time pressure 
increases within a range, the following hypothesis will be tested: 


H3: The magnitude of the time pressure effect predicted by H1 is larger for subjects with 
significant procedural partnership tax knowledge than for subjects without such 
knowledge. 


III. RESEARCH METHOD

Overview 

The research hypotheses were addressed experimentally. The experiment included two
levels of time pressure and three groups of subjects. Each subject completed an experimental case 
and a control case; both cases were administered on computers. For each case, subjects were 
presented with a set of facts, a tax issue, and a key word index. Subjects were asked to select from 
the index all the key words they believed would refer them to tax authority relevant to resolving
the tax issue described in each case. The dependent variable was the number of “relevant” key 
words selected within a specific time period. Before starting the actual experiment, subjects 
worked through a practice case designed to help them understand the specifics of the experimental 
task. The details of the experiment are described below. 


Subjects 


The first group of subjects consisted of 51 full-time graduate tax students at a large university 
who were at the beginning of their degree program and had no work experience in taxation. The 
second group consisted of 53 full-time graduate tax students enrolled at a second large university 
who were near the end of their degree program.³ These subjects had very little work experience
in taxation (mean of 1.2 months; median of 0 months). The third group consisted of 43 tax
professionals from a Big-6 accounting firm.⁴ These subjects reported an average of 30.5 (median
of 28) months of tax experience.⁵ The students’ participation was required as part of their
respective courses. The professionals participated at the request of a partner in their respective
offices.


Materials 


The materials consisted of a partnership tax case, a control case, questions relating to the 
cases, a background questionnaire, and two 10-question, true/false, multiple-choice tests, one 
dealing with the subject matter in the partnership case and the other with the subject matter in the 
control case. Each case consisted of a set of facts, a tax issue, and an abbreviated version of the 
key word index to the 1992 Commerce Clearing House (CCH) tax research service. The indexes 
consisted of main headings, first-level subheadings (or first-level key words), and second-level 
subheadings. First-level key words related only to the main heading under which they were listed; 
second-level key words related only to the first-level key word under which they were listed. At 
any given time, the index could display only main headings, first-level key words, or second-level 
key words. First-level key words were accessible only through main headings; second-level key 
words were accessible only through first-level key words. All main headings had related first- 
level key words, but not all first-level key words had related second-level key words. For purposes 
of measuring the dependent variable, second-level key words and first-level key words that had 
no related second-level key words were “selectable” items. The index for the partnership case 
consisted of 1,336 selectable key words, of which 644 were first-level key words. The index for 
the control case was of similar size. Each index included 45 main headings (15 main headings to 
a screen × 3 screens); the total number of key words under each main heading was proportional
to the number of key words under that main heading in the actual CCH index. 
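
The three-level index and the definition of a "selectable" item can be sketched as a nested mapping. This is only an illustration under stated assumptions: the `Index` type, the `selectable_items` function, and every key word in `toy_index` are hypothetical (only the two main heading names appear elsewhere in this article), and nothing here reproduces the actual experimental software or the CCH index.

```python
from typing import Dict, List, Tuple

# main heading -> first-level key word -> list of second-level key words
Index = Dict[str, Dict[str, List[str]]]


def selectable_items(index: Index) -> List[Tuple[str, ...]]:
    """List items that count toward the dependent variable.

    Second-level key words are always selectable. A first-level key word
    is selectable only when it has no second-level key words beneath it.
    """
    items: List[Tuple[str, ...]] = []
    for heading, first_level in index.items():
        for key_word, second_level in first_level.items():
            if second_level:
                items.extend((heading, key_word, sub) for sub in second_level)
            else:
                items.append((heading, key_word))
    return items


# Hypothetical toy index; the key words are invented for illustration.
toy_index: Index = {
    "Partners and Partnerships": {
        "Basis of partner's interest": ["Liabilities, effect of", "Contributions"],
        "Definitions": [],
    },
    "Non Recourse Liabilities": {
        "Allocation among partners": [],
    },
}

items = selectable_items(toy_index)
```

In the toy index, "Basis of partner's interest" is a navigation step rather than a selectable item because it has second-level key words, whereas "Definitions" is selectable because it has none.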


Procedures 


Subjects were first briefed on the experimental instructions, after which they worked on a 10- 
minute practice case involving subject matter unrelated to the experimental and control cases.⁷
During the first half of the practice case, subjects received “hands-on” instruction on the program 
operating procedures. During the second half, subjects worked on their own to further familiarize 
themselves with the program. On concluding the practice case, participants proceeded with the 
experimental and control cases. The presentation order of the cases was counterbalanced across 
experimental conditions. 


3 Due to the timing of the study and to the sequencing of tax courses, it was not possible to get the desired groups of student
subjects from the same university. A previous study that used graduate tax students from the same two universities found
no differences in demographics and no differences in experimental responses between the two student groups (Limberg 
et al. 1994). 

4 Originally, the three groups consisted of 52, 55, and 50 subjects respectively. One subject from the first group was
eliminated because his responses were inadvertently erased. Two subjects from the second group were eliminated
because they did not follow instructions. Six subjects from the third group were eliminated because they lacked work 
experience in partnership taxation. Finally, one subject from the third group was eliminated because the software did 
not record his responses. 

5 Subjects with two to three years experience were used since they are the ones who do the bulk of initial tax research.

6 The associated main heading was displayed with first-level key words. The associated main heading and first-level key
word were displayed with second-level key words.

7 The practice case involved the issue of whether an employee could claim deductions relating to the unreimbursed
purchase of a math co-processor that he bought solely to increase the processing speed of his computer.
To remind subjects of their objective, the following message was displayed on the participants’
computer screens at the beginning of each case:


The following is a case that contains a set of facts, a tax issue, and an abbreviated 
version of the key word index to the CCH tax information service. You will be selecting 
the key words in the index that you believe would refer you to authority relevant to 
resolving the specific issue described in the case if you were to look up the reference in
the actual CCH service (examples of authority include Code Sections, Regulations, court 
cases, revenue rulings, etc.). YOUR OBJECTIVE IS TO SELECT ALL OF THE 
RELEVANT KEY WORDS IN THE INDEX WITHIN THE BUDGETED TIME. 


The subjects were then presented with the problem statement (i.e., the facts and issue) for the case. 
The problem statement for the partnership case is reproduced in appendix A, panel A; the problem 
statement for the control case is reproduced in appendix B, panel A. By pressing a key, 
participants could go from the problem statement to the index of key words. They could return 
to the problem statement at any time by pressing the same key. To simulate the time required to
“look up” the reference in the actual research service, the remaining search time was reduced by
30 seconds each time a subject "selected" a key word.⁸ Subjects were informed of this key word
selection "cost" before the experiment began. Moreover, the computer screen displayed a 
message reminding them of the 30-second reduction each time they selected a key word. 
Participants could terminate their search at any time, but were instructed to do so only when they 
believed they had selected all the relevant key words in the index. After completing the first case, 
subjects proceeded to the second case. On concluding the second case, subjects answered some 
additional questions relating to both cases, took the tax knowledge tests, and completed the 
background questionnaire.⁹
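
The 30-second selection cost can be illustrated with a minimal sketch of a search clock. The `SearchClock` class and its method names are hypothetical and are not taken from the experimental software; the sketch assumes only what is stated in the text, namely that each selection reduces the remaining time by 30 seconds and that a selection is refused when fewer than 30 seconds remain.

```python
from typing import List

SELECTION_COST_SECONDS = 30  # simulated "look up" time charged per selection


class SearchClock:
    """Tracks remaining search time and the key words selected so far."""

    def __init__(self, budget_seconds: int) -> None:
        self.remaining = budget_seconds
        self.selected: List[str] = []

    def tick(self, seconds: int) -> None:
        """Charge ordinary browsing time against the budget."""
        self.remaining = max(0, self.remaining - seconds)

    def select(self, key_word: str) -> bool:
        """Select a key word if at least 30 seconds remain; report success."""
        if self.remaining < SELECTION_COST_SECONDS:
            return False
        self.remaining -= SELECTION_COST_SECONDS
        self.selected.append(key_word)
        return True


# Example: an 11-minute budget with 10 minutes already spent browsing.
clock = SearchClock(budget_seconds=11 * 60)
clock.tick(600)
results = [clock.select("A"), clock.select("B"), clock.select("C")]
```

With 60 seconds left, two selections are accepted and the third is refused, so each selection competes directly with browsing time.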


Relevance Assessment 


The index to the partnership case contained 19 "relevant" key words that were selected from 
the actual CCH index by a panel of partnership tax experts. The index to the control case contained 
22 relevant key words selected from the actual CCH index by a panel of international tax experts.¹⁰
If the treatment of any key word was not agreed on by all panel members, it was not included in 
the index. The relevant key words for the partnership case are presented in appendix A, panel B; 
the relevant key words for the control case are presented in appendix B, panel B. 


Dependent Measure 


The dependent measure is the number of "relevant" key words selected during the first 11 
minutes of the task. While the number of relevant key words selected is not a direct measure of 
performance at tax research, it is expected to correlate positively with a tax researcher's ability 
to find relevant authority, for at least two reasons. First, a single key word may not refer to all the 
relevant authority, because many tax issues are affected by multiple tax laws. Second, authority 


8 The software would not allow subjects to make selections when they had less than 30 seconds remaining.

9 Subjects were allowed as much time as they needed to complete the additional questions, the knowledge tests, and the
background questionnaire. 

10 Members of the partnership tax expert panel specialize in partnership taxation and members of the international tax 
expert panel specialize in international taxation. The partnership tax panel included a partner, a senior manager, two 
managers (all from Big-6 accounting firms), and a tax professor. The international tax panel included two senior 
managers from a Big-6 public accounting firm, two tax directors for multinational corporations (one of whom had been 
a senior manager and the other a manager for a Big-6 accounting firm), and a tax professor. 
may be indexed under any number of potential key words. Gardner and Stewart (1993) point out
that tax researchers must be able to use multiple approaches (i.e., select multiple key words) to 
the same tax problem to maximize the likelihood of finding relevant authority. 


Independent Variables 


The design includes two between-subjects variables: time pressure (two levels), and 
knowledge (three levels). Time pressure was either low or moderate and was manipulated as 
follows. First, subjects in the low time pressure condition were told they had 25 minutes to 
complete each case; subjects in the moderate time pressure condition were told they had 11 
minutes to complete each case.¹¹ Second, the timer, which was displayed on each subject's
computer screen, counted up the elapsed time for subjects in the low time pressure condition and 
counted down the remaining time for subjects in the moderate time pressure condition. Finally, 
subjects in the low time pressure condition were told that 25 minutes should be “more than 
adequate” time to complete each case, while subjects in the moderate time pressure condition 
were told that they would have “only” 11 minutes to complete each case. Each experimental 
session was randomly assigned to one of the two time pressure conditions.¹³

Because time pressure, rather than time per se, is the variable of interest, only selections made 
by subjects in the first 11 minutes of each case were included in the analysis. Subjects in the 
moderate time pressure condition were aware of the impending 11-minute deadline, while 
subjects in the low time pressure condition were not. As a result, the subjects’ perceptions of time 
pressure were manipulated between time pressure conditions, while the actual time each subject 
had to select relevant key words was held constant. 
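
The scoring rule described above can be sketched as follows. The function name, the representation of a selection as an (elapsed seconds, key word) pair, and the treatment of a selection made exactly at the cut-off are assumptions made for illustration; the text specifies only that relevant key words selected during the first 11 minutes are counted, in both time pressure conditions.

```python
from typing import Iterable, Set, Tuple

CUTOFF_SECONDS = 11 * 60  # only the first 11 minutes of each case are analyzed


def relevant_selected(
    selections: Iterable[Tuple[float, str]],
    relevant: Set[str],
    cutoff: float = CUTOFF_SECONDS,
) -> int:
    """Count distinct relevant key words selected at or before the cut-off.

    Each selection is an (elapsed_seconds, key_word) pair. Counting a
    selection made exactly at the cut-off is an assumption of this sketch.
    """
    counted = {kw for elapsed, kw in selections if elapsed <= cutoff and kw in relevant}
    return len(counted)


# Hypothetical log for a subject in the low time pressure (25-minute) condition.
log = [(120.0, "a"), (300.0, "b"), (650.0, "c"), (700.0, "d"), (200.0, "a")]
score = relevant_selected(log, relevant={"a", "c", "d"})
```

Here "d" is relevant but was selected after the cut-off, and "b" was selected in time but is not relevant, so neither contributes to the score.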

The three knowledge groups are the “naive” group, the “declarative” group, and the “proce- 
dural” group. The naive group, formed to include subjects who had little or no declarative or 
procedural partnership tax knowledge, consisted of the incoming masters of tax students. The 
declarative group, formed to include subjects who had significant declarative partnership tax 
knowledge but little or no procedural partnership tax knowledge, consisted of the graduating 
masters of tax students. Finally, the procedural group, formed to include subjects who had 
significant declarative and procedural partnership tax knowledge, consisted of the experienced 
tax professionals. The types of knowledge possessed by the naive, declarative, and procedural 
groups are consistent with the types of knowledge held by accountants in the "naive," “educated,” 
and "experienced" stages, respectively, of expertise development proposed in Gibbins (1988). 


Measures of Knowledge Used to Validate Differences between Subject Groups 


Different measures were used to verify the expected declarative and procedural knowledge 
differences between groups. Declarative partnership tax knowledge was measured with a 10- 


11 The time limits were set based on the results of pilot studies using graduate tax students and Ph.D. students with tax work
experience. Twenty-five minutes was deemed to be the maximum amount of time that could be allowed for each case 
and still keep the time requirements for the entire experiment under one and one-half hours (important for scheduling 
purposes). Eleven minutes allowed subjects a reasonable amount of time to select key words while still making them 
feel time pressure. 

12 Discussions with subjects in the pilot studies provided some informal evidence that a timer counting down the remaining
time caused greater perceptions of time pressure than a timer counting up the elapsed time. 

13 The professionals participated at their respective offices in groups of two to four. The students participated at their
respective universities in groups of two to six. The experimenter was present at all sessions. 

14 No subjects in the low time pressure condition terminated their search before the 11-minute cut-off in either case. No
subjects in the moderate time pressure condition terminated their search early in the control case and only three subjects
in the moderate time pressure condition terminated their search early in the partnership case (by an average of 19 seconds
each). 

15 The expected knowledge differences were confirmed with various measures, which are explained in the results section.
question, true/false, multiple-choice test constructed from information in various tax resource
materials.¹⁶ The test covered such topics as definitions of recourse and non recourse liabilities
and the identification of factors that affect the tax basis of a partner’s partnership interest.¹⁷ This
method of measuring declarative knowledge is similar to that used in Bonner and Walker (1994). 

Procedural knowledge was measured with subjects' responses to questions relating to the 
extent and types of their tax work and research experiences. Experience measures are used to infer
procedural knowledge in this study, due to difficulties involved in testing procedural knowledge 
directly (Bonner and Walker 1994). The use of experience measures in determining the extent of 
procedural knowledge is consistent with Anderson's (1982) theory that procedural knowledge 
develops through experience at the task and is supported by the empirical work of Bonner et al. 
(1992), who found that measures of experience (similar to those used in this study) related 
positively to procedural knowledge. The results of the validation tests are described in the results 
section. 


Control Case 


Subjects completed a control case to help rule out alternative explanations for the results. In 
contrast to the partnership case, the control case is in a domain (international taxation) in which 
none of the three knowledge groups had received significant instruction or had significant 
experience. Because no between-group knowledge differences were expected in the control case, 
the absence of between-group performance differences in the control case provides assurance that 
between-group performance differences in the partnership case are due to knowledge differences 
and not other subject differences (Frederick 1991; Frederick and Libby 1986; Libby 1993). 


IV. RESULTS 


Time Pressure Manipulation Check 


Subjects evaluated the amount of time pressure they experienced during the task. Their 
responses were made on a five-point scale anchored on one end by “no time pressure" and on the 
other end by "extreme time pressure." For reporting purposes, the "no time pressure" anchor was 
assigned a value of zero; the "extreme time pressure" anchor was assigned a value of 100. The 
mean score for subjects in the moderate time pressure condition (57.0) and the mean score for 
subjects in the low time pressure condition (37.5) are significantly different from each other 
(F = 23.16; p = .00). The manipulation check for the control case yielded similar results. 
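The scoring and comparison described above can be sketched as follows. This is a minimal illustration, not the study's procedure or data: the responses are hypothetical, and the sketch assumes the five scale points were coded 1 through 5 and spaced equally between the two anchors (0 and 100), which the text does not state.

```python
from statistics import mean


def rescale(response, points=5):
    """Map a 1..points response onto 0..100, assuming equally spaced points."""
    return (response - 1) * 100.0 / (points - 1)


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Hypothetical raw responses (1 = "no time pressure", 5 = "extreme time pressure").
low = [rescale(r) for r in [2, 3, 2, 3, 2, 3]]
moderate = [rescale(r) for r in [3, 4, 3, 4, 3, 4]]
f_stat, df_between, df_within = one_way_f([low, moderate])
print(mean(low), mean(moderate), f_stat)
```

With only two conditions, the F statistic from this one-way layout equals the square of the two-sample t statistic, so either test yields the same inference.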


Verification of Subject Group Knowledge Differences 


Declarative Knowledge 

For the test of declarative partnership tax knowledge, the mean score of subjects in the 
procedural group (6.84 out of a possible 10) and the mean score of subjects in the declarative group 
(6.83) are both significantly higher (p < .05) than the mean score of subjects in the naive group 
(3.78). These scores indicate that, as expected, the procedural and declarative groups are roughly 
equivalent in terms of their declarative partnership tax knowledge, and both groups have more 


¹⁶ Declarative international tax knowledge was measured with a test of similar form. 

¹⁷ For validation purposes, the tests were given to three partnership tax experts (partnership test only), three international 
tax experts (international test only), a group of 28 pilot subjects who were expected to have declarative partnership tax 
knowledge and little or no international tax declarative knowledge, and a group of 34 pilot subjects who were expected 
to have little or no declarative partnership tax or international tax knowledge. The results indicate that the test 
scores are high when those taking the tests are expected to have the knowledge being tested; the scores are low when 
they are not. 
declarative tax knowledge than the naive group. For the international tax test, the mean score of 
subjects in the procedural group (3.60) is significantly higher than both the mean score of subjects 
in the declarative group (2.64) and the mean score of subjects in the naive group (2.39). The mean 
scores of subjects in the declarative and naive groups are not significantly different from each 
other. While it was anticipated that subjects in all three groups would have a similar level of 
declarative international tax knowledge, these scores suggest that subjects in the procedural group 
had more declarative international tax knowledge than subjects in either of the other two groups. 


Procedural Knowledge 

Following the method used in Bonner et al. (1992), responses to the five experience questions 
and subjects’ scores on the partnership tax test were standardized and correlated. A factor analysis 
using the principal components method of extraction with varimax rotation was applied to the 
correlation matrix. The analysis indicates that four factors account for 86 percent of the variance 
in the six variables, each factor accounting for at least 10 percent of the variance. Table 1 presents 
the factors, the percentage of variance explained by each factor, and the factor loadings. 
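The extraction and rotation steps named above can be sketched in a few lines. This is an illustrative sketch only: the data are randomly generated stand-ins for the six standardized variables, the function names are hypothetical, and the rotation uses a standard SVD-based varimax iteration that may differ in detail from the routine the study used.

```python
import numpy as np


def principal_component_loadings(data, n_factors):
    """Loadings from the correlation matrix of column-standardized data."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Keep the n_factors largest eigenvalues, largest first.
    order = np.argsort(eigvals)[::-1][:n_factors]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    # Each eigenvalue divided by the number of variables is a variance share.
    return loadings, eigvals[order] / corr.shape[0]


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion - criterion < tol:
            break
        criterion = new_criterion
    return loadings @ rotation


# Hypothetical data: 147 subjects by 6 variables, with one induced correlation.
rng = np.random.default_rng(0)
data = rng.normal(size=(147, 6))
data[:, 1] += data[:, 0]
loadings, share = principal_component_loadings(data, 4)
rotated = varimax(loadings)
print(np.round(rotated, 2))
print(np.round(share, 2))
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings across the retained factors) is unchanged; only the distribution of variance across factors shifts.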

Factor 1 loads high on subjects’ work experience, percentage of time working on partnership 
tax matters, and subjects’ partnership tax research experience. This factor explains more variance 
in the six variables (51 percent) than the other three factors combined. If, in accordance with 
Anderson’s (1982) theory, procedural knowledge develops through experience at the task, then 
factor 1 should relate positively to procedural knowledge. Accordingly, it is the focus of the 
procedural knowledge verification. 


TABLE 1 
Factor analysis of variables from additional questions, 
background questionnaire, and knowledge test. 


                                               Factor Loading* 

Variables                                    Factor 1  Factor 2  Factor 3  Factor 4 
Experience in months                           .82       .27       .02       .13 
Percentage of work time 
  spent on partnerships                        .82       .14       .27       .13 
General research experience                    4         .91       .08       .22 
Partnership research experience                .42       2         .26       .12 
Number of times encountered issue 
  (in practice or in the classroom) similar 
  to the issue in the partnership case         .06       .17       .95       Al 
Score on partnership knowledge test            .00       .24       .12       .94 
Proportion of Variance Contribution (%)        51        13        12        10 


* Only factors with loadings above .40 (underlined) were used in computing the factor scores. 
To test for differences between knowledge groups on this factor, factor 1 scores were 
computed for each subject. These scores were used as the dependent variable in a one-way 
ANOVA with three levels of knowledge (i.e., procedural, declarative, and naive). The ANOVA 
is significant (F = 329.88; p = .00). Post hoc tests indicate that the procedural group scored higher 
on this factor than the declarative and naive groups, and that the scores of the declarative and naive 
groups are not significantly different. This result provides support for the expected procedural 
knowledge differences between groups. Further, the existence of factor 4, which loads high on 
only the declarative knowledge test score, provides support for the idea that procedural 
knowledge and declarative knowledge are different constructs. 


Tests of Hypotheses 


H1 predicts that subjects in the moderate time pressure condition will select a greater number 
of relevant key words than subjects in the low time pressure condition. Table 2, panel A provides 
descriptive results; figure 1 displays the results graphically.¹⁸ The overall ANOVA is presented 
in table 2, panel B;¹⁹ the planned comparison used to test H1 is presented in table 2, panel C.²⁰ 
While, on average, subjects in the moderate time pressure condition selected more relevant key 
words than subjects in the low time pressure condition, the difference is not significant (t = .50, 
p = .31). However, because the time pressure by knowledge interaction in the overall ANOVA 
is significant (F = 3.68, p = .03), tests of the simple main effects of time pressure were conducted. 
The effect of time pressure is positive and significant for the procedural group (F = 4.41, p = .04) 
and positive but not significant for the declarative group (F = .10, p = .75). For the naive group, 
the effect of time pressure was negative (opposite the predicted direction) and marginally 
significant (F = 2.94, p = .09).²¹ 

H2A predicts that subjects in the declarative group will select a greater number of relevant 
key words than subjects in the naive group. H2B predicts that subjects in the procedural group will 
select a greater number of relevant key words than subjects in the declarative group. Table 2, Panel 
C presents the results of the planned comparisons used to test these hypotheses. For H2A, the 
difference between the declarative group and the naive group is in the predicted direction and is 
significant (t = 3.10, p = .00). For H2B, the difference between the procedural group and the 


¹⁸ The seemingly low number of relevant key words selected by the procedural group can be attributed in large part to time 
limitations. Because subjects spent time reading the problem statement and selecting key words (30 seconds per 
selection), they had limited time available to look for key words. Consequently, the ratio of relevant key words selected 
to total key words selected (42 percent for procedural subjects) is a more appropriate indicator of performance than is 
the ratio of relevant key words selected to total relevant key words selected. When evaluating the practical significance 
of the results, it is important to consider that one key word can mean the difference between finding and not finding the 
target authority. 

¹⁹ Because the variances between cells were not homogeneous, the data were transformed with a square-root 
transformation, as suggested by Winer (1971, 399). However, since the analysis of the raw data and the analysis of the transformed 
data yield similar inferences, only the analysis of the raw data is presented here. 

²⁰ Note that the planned comparison for time pressure is a one-tailed version of the main effect for time pressure in the 
overall ANOVA. 

²¹ Potential alternative explanations for the significant time pressure effect on the procedural group stem from the fact that 
subjects in the moderate time pressure condition knew when time was to expire while subjects in the low time pressure condition 
did not. One potential explanation is that "end of game effects" drive the result. This explanation is discounted by the 
fact that an analysis of the data at 10 minutes (i.e., at a point in the experiment when neither time pressure group knew 
that time would expire) yields similar results to the data collected at 11 minutes. A second explanation is that 
subjects in the low time pressure condition located relevant key words before the 11-minute deadline but deferred 
making the selection until some point after the 11-minute deadline, simply because they believed they had time available 
to do so. However, an inspection of each subject's detailed search record indicates that this did not happen. Of the 22 
procedural subjects in the low time pressure condition, only one viewed a relevant key word before 11 minutes and 
selected that key word after the deadline. 
FIGURE 1 
Number of relevant key words selected by condition 
declarative group is also in the predicted direction and is significant (t = 2.95, p = .00). Thus the 
results support H2A and H2B. 

H3 predicts that the magnitude of the time pressure effect predicted by H1 is larger for 
subjects in the procedural group than it is for subjects in the other two groups. Table 2, Panel C 
presents the results of the planned comparison used to test H3. The interaction is in the predicted 
direction and is significant (t = 2.29, p = .02), supporting H3. 
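The reported test statistics can be cross-checked against the sums of squares in table 2, panel B. The sketch below is only an arithmetic check on the published figures, not a reanalysis: each F ratio is the effect's mean square divided by the error mean square, and for a comparison with one degree of freedom the t statistic is the square root of F. The variable names are illustrative.

```python
from math import isclose, sqrt

# Sums of squares and degrees of freedom as reported in table 2, panel B.
panel_b = {
    "Time Pressure": (0.45, 1),
    "Knowledge": (61.69, 2),
    "TP x Knowledge": (13.16, 2),
}
ss_error, df_error = 252.20, 141
ms_error = ss_error / df_error

# F = (SS_effect / df_effect) / MS_error for each source of variance.
f_ratios = {name: (ss / df) / ms_error for name, (ss, df) in panel_b.items()}

# With one degree of freedom, t is the square root of F; halving the
# two-tailed p of the main effect gives the one-tailed p for H1.
t_time_pressure = sqrt(f_ratios["Time Pressure"])
one_tailed_p = 0.62 / 2

for name, f_value in f_ratios.items():
    print(f"{name}: F = {f_value:.2f}")
print(f"H1 planned comparison: t = {t_time_pressure:.2f}, p = {one_tailed_p:.2f}")
```

The recomputed values round to the reported F ratios (.25, 17.24, and 3.68) and to the planned comparison for H1 (t = .50, p = .31), consistent with the note that the planned comparison is a one-tailed version of the main effect.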


Control Case 


This paper asserts that between-group differences in the level of declarative partnership tax 
knowledge are responsible for the H2A result and that between-group differences in the level of 
TABLE 2 
Effects of time pressure and knowledge on number of relevant key words selected by condition. 


Panel A: Mean number of relevant key words selected (standard deviations are in parentheses). 


                                 Knowledge 
Time Pressure     naiveᵃ     declarativeᵇ     proceduralᶜ     overall 
low                                                            1.55 
                                                              (1.36) 
                                                              n = 71 
moderate                                                       1.58 
                                                              (1.63) 
                                                              n = 76 
overall            .78          1.62             2.42          1.57 
                  (1.00)       (1.33)           (1.71)        (1.50) 
                  n = 51       n = 53           n = 43        n = 147 


ᵃ Subjects in the naive group have little or no declarative partnership tax knowledge or procedural partnership tax research 
knowledge. 

ᵇ Subjects in the declarative group have significant declarative partnership tax knowledge but little or no procedural 
partnership tax research knowledge. 

ᶜ Subjects in the procedural group have significant declarative partnership tax knowledge and procedural partnership tax 
research knowledge. 


Panel B: Analysis of Variance: Dependent variable is the number of relevant key words selected. 


                     Sum of              Mean 
Source               Squares     df      Square      F        p 
Time Pressure           .45       1        .45      .25      .62 
Knowledge             61.69       2      30.84    17.24      .00 
TP x Knowledge        13.16       2       6.58     3.68      .03 
Error                252.20     141       1.79 


Panel C: Planned Comparisons: Dependent variable is the number of relevant key words selected. 


Planned Comparison                           df      MS       t       p 

Time Pressure: 
  low vs. moderate                            1      .45     .50     .31   (H1) 
Knowledge: 
  declarative vs. naive                       1    17.25    3.10     .00   (H2A) 
  procedural vs. declarative                  1    15.52    2.95     .00   (H2B) 
Time Pressure x Knowledge: 
  procedural vs. non procedural groupsᵈ       1     9.41    2.29     .02   (H3) 


ᵈ This comparison contrasts the magnitude of the effect of time pressure on the procedural group with the magnitude of 
the effect of time pressure on a pooled group consisting of subjects in the declarative group and subjects in the naive 
group. 
procedural partnership tax knowledge are responsible for the H2B and H3 results. The control 
case was designed to provide some assurance that the specified knowledge differences drive the 
results. As evidenced by their scores on the knowledge tests, subjects in the declarative and naive 
groups differed in terms of their declarative partnership tax knowledge but not their declarative 
international tax knowledge. Consistent with the declarative knowledge explanation for H2A, 
relative to the naive subjects, the declarative subjects selected a greater number of relevant key 
words in the partnership case but not in the control case (t = .53, p = .60). 

The procedural knowledge explanation for the H2B result cannot be directly supported by the 
control case results, because subjects in the procedural group selected more relevant key words 
than subjects in the declarative group in both the partnership case and the control case (t = 2.93, 
p = .00). However, the difference in the control case is likely attributable to the fact that, as 
evidenced by the knowledge test scores, procedural subjects had more declarative international 
tax knowledge than did declarative subjects. 

The idea that procedural knowledge is driving the time pressure and knowledge interaction 
in the partnership case is supported by the absence of the same interaction in the control case 
(t = .92, p = .36). The absence of an interaction in the control case rules out the alternative 
explanation that in all tax situations experienced tax professionals will be more successful at 
adapting to time pressure than will tax students. If experienced tax professionals better adapt to 
time pressure in all situations, the time pressure and knowledge interaction would also have been 
observed in the control case. 


Other Data 


Table 3 provides descriptive statistics, by time pressure and knowledge conditions, relating 
to the most frequently selected relevant key words in the partnership case. In general, the 
procedural and declarative groups selected a number of relevant key words that were not 
explicitly mentioned in the problem statement, while the naive group tended to select relevant key 
words that were stated in the problem statement. For example, the procedural and declarative 
subjects tended to select key words with the words "non recourse" in them, while the naive 
subjects did not. Collapsing across time pressure conditions, 35 percent of the procedural 
subjects, 19 percent of the declarative subjects, and 0 percent of the naive subjects selected the 
key word "NON RECOURSE FINANCING: partnership liabilities." Further, 30 percent of the 
procedural subjects, 11 percent of the declarative subjects, and only 2 percent of the naive subjects 
selected the relevant key word "PARTNERS AND PARTNERSHIPS: Basis of partner's 
interest: non recourse loan." While the problem statement in the partnership case did not include 
the words "non recourse," non recourse liabilities are an important part of the solution to the 
problem, because when computing the basis of a partnership interest, non recourse liabilities are 
allocated to partners differently than are recourse liabilities. The naive subjects tended to select 
key words that were similar to the words in the case description. For example, all three of the most 
frequently selected relevant key words by the naive group included the words “basis” and 
“partner’s interest." 

These differences suggest that subjects with relevant declarative knowledge (the procedural 
and the declarative groups) were able to recognize relevant key words when they encountered 
them in the index, while subjects with little relevant declarative knowledge (the naive group) were 
limited to key words used in the problem statement. These descriptive results are consistent with 
findings reported in the psychology literature indicating that novices tend to focus on the surface 
features of a problem while those with more knowledge are able to focus on deeper characteristics 
(e.g., Chi et al. 1981; Gobbo and Chi 1986). 
TABLE 3 
Most frequently selected relevant key words by time pressure and knowledge condition. 

PROCEDURAL SUBJECTS 

Moderate time pressure 
  NON RECOURSE FINANCING: Partnership liabilitiesᵃ (48%)ᵇ 
  BASIS: Partnership liabilities: increase to (38%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: non recourse loan (38%) 
  PARTNERS AND PARTNERSHIPS: Liabilities as a part of basis (33%) 
  PARTNERS AND PARTNERSHIPS: Liabilities: basis increase (24%) 
Low time pressure 
  ADJUSTED BASIS: Partner’s interest (36%) 
  ALLOCATION: Partnership liabilities (27%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: loans affecting (27%) 
  NON RECOURSE FINANCING: Partnership liabilities (23%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: non recourse loan (23%) 

DECLARATIVE SUBJECTS 

Moderate time pressure 
  NON RECOURSE FINANCING: Partnership liabilities (32%) 
  ALLOCATION: Partnership liabilities (25%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: loans affecting (21%) 
  PARTNERS AND PARTNERSHIPS: Liabilities as a part of basis (18%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: non recourse loan (11%) 
  PARTNERS AND PARTNERSHIPS: Liabilities treated as distributions or contributions (11%) 
  ADJUSTED BASIS: Partner’s interest (11%) 
Low time pressure 
  ADJUSTED BASIS: Partner’s interest (56%) 
  ALLOCATION: Partnership liabilities (20%) 
  BASIS: Partnership liabilities: increase to (16%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: non recourse loan (12%) 
  PARTNERS AND PARTNERSHIPS: Liabilities as a part of basis (12%) 

NAIVE SUBJECTS 

Moderate time pressure 
  ADJUSTED BASIS: partner’s interest (15%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: loans affecting (11%) 
  BASIS: Partnership liabilities: increase to (7%) 
Low time pressure 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: loans affecting (29%) 
  ADJUSTED BASIS: partner’s interest (21%) 
  PARTNERS AND PARTNERSHIPS: Basis of partner’s interest: liabilities treated as distributions or contributions (21%) 


ᵃ The key words are organized as follows: 
MAIN HEADINGS: First-level key words: second-level key words 
ᵇ The percentage of subjects who selected this relevant key word. 
Descriptive statistics relating to the influence of time pressure on types of relevant key words 
selected also yield insight. In general, subjects in the procedural and declarative groups tended 
to select relevant key words located towards the beginning of the index when searching under low 
time pressure and key words that were positioned deeper in the index when searching under 
moderate time pressure. Subjects in the naive group tended to select similar key words in both time 
pressure conditions. The patterns of reference to the key words "ADJUSTED BASIS: partner's 
interest" and "NON RECOURSE FINANCING: partnership liabilities" illustrate these points. 
Procedural and declarative subjects selected "ADJUSTED BASIS: partner's interest" with 
frequencies of 36 percent and 56 percent, respectively, in the low time pressure condition and with 
frequencies of only 10 percent and 11 percent, respectively, in the moderate time pressure 
condition. In contrast, 21 percent of the naive subjects in the low time pressure condition and 15 
percent of the naive subjects in the moderate time pressure condition selected this key word. 
Conversely, procedural and declarative subjects selected “NON RECOURSE FINANCING: 
partnership liabilities” with frequencies of only 23 percent and 4 percent, respectively, in the low 
time pressure condition and with frequencies of 48 percent and 32 percent, respectively, in the 
moderate time pressure condition. None of the naive subjects in either time pressure condition 
selected this key word. 

The divergent patterns of reference to the “ADJUSTED BASIS” and “NON RECOURSE 
FINANCING” key words may be attributable to the fact that “ADJUSTED BASIS: partner’s 
interest” was the most general of all the relevant key words, and it was the first relevant key word 
listed in the index. In contrast, “NON RECOURSE FINANCING: partnership liabilities” was one 
of the more specific relevant key words, and it was listed deeper in the index. It may be that time 
pressure leads to less sequential searches that focus on more specific or deeper aspects of the 
problem. However, since this was not tested, no conclusions can be drawn. 

In terms of irrelevant key words selected, all three groups made the mistake of selecting key 
words that did not refer to the basis of the partnership interest but rather to the basis of some other 
asset. For example, the most frequently selected irrelevant key word by all three groups was 
“ADJUSTED BASIS: Debt financed property.” This key word is not relevant, because it refers 
to the basis of the asset that was financed by debt and not to the basis of the partnership interest. 
Another common mistake was that subjects selected key words that were too broad in scope to 
relate directly to the issue described in the case. An example of such a key word that was among 
the most frequently selected key words by all three knowledge groups was “BASIS: assumption 
of liabilities.” 


V. SUMMARY AND CONCLUSIONS 


This study explores the effects of time pressure and knowledge on tax researcher performance 
by examining how these variables affect the ability of tax researchers to select relevant key words 
in a time-restricted task. The results are consistent with the predictions, with the exception of the 
prediction for the main effect of time pressure. While time pressure was predicted to have a 
positive effect on all three subject groups, only the procedural group responded to time pressure 
by significantly increasing the number of relevant key words selected. Tests of the other 
hypotheses support the predictions that both declarative knowledge and procedural knowledge 
in the applicable domain are important for selecting relevant key words and that subjects with 
significant procedural knowledge are more successful at adapting to increasing time pressure than 
subjects lacking this knowledge. This paper also presents descriptive results consistent with 
findings in other disciplines that novices attend to surface features of the problem, while more 
knowledgeable individuals give consideration to the problem’s deeper aspects (e.g., Chi et al. 
1981). 
This study’s primary contribution is that it presents initial empirical evidence that time 
pressure and knowledge interact. The results indicate that procedural knowledge facilitates 
adaptation to time pressure and support previously untested claims that more knowledgeable 
accountants should adapt to increasing time pressure better than less knowledgeable accountants 
(Brown and Solomon 1992; Gibbins 1984). Empirical evidence supporting a time pressure and 
knowledge interaction provides a new perspective on studies that investigate time pressure effects 
and knowledge effects separately (e.g., Bonner and Lewis 1990; McDaniel 1990). Further, the 
idea that time pressure and knowledge interact may have implications for practice in terms of 
staffing assignments. 

The findings that performance improves with the acquisition of declarative knowledge and 
that it further improves with the development of procedural knowledge provide empirical support 
for descriptive papers that discuss the process of accounting expertise development (e.g., Gibbins 
1988). The results provide insights that may be useful for developing tax courses and training 
programs designed to increase tax researcher performance. 

In contrast to traditional time pressure studies that manipulate time pressure by varying the 
amount of time the subject has available to perform the task (e.g., Brown and Solomon 1992; 
McDaniel 1990; Ben Zur and Breznitz 1981), this study held time constant and varied subjects’ 
perceptions of time pressure. As a result, the research design used in this study addresses the 
question of how time pressure affects performance within a given time period, as opposed to 
questions relating to how time pressure affects average performance over different time periods. 

While the issues described in this study have been addressed from a tax perspective, they are 
likely to apply to other accounting settings as well. For example, financial accountants and 
auditors search financial accounting authorities for relevant authority to determine appropriate 
accounting treatments for various transactions (e.g., Salterio 1994). With the increased presence 
of accounting information on CD-ROM and improvements in communications technology, 
auditors are beginning to search the growing database of financial accounting authority early in 
their careers. 

Limitations of this study point to potential areas for future research. For example, in this study 
no distinctions were made between relevant key words. Future research may address the extent 
to which selected key words are likely to lead to complete solutions. Also, the experimental task 
required subjects to select key words that were likely to refer them to relevant authority rather than 
to select the actual authority. Future research may focus on the researcher’s ability to find actual 
tax authority (e.g., see Cloyd 1994). Further, this study did not directly measure the extent of 
subjects’ procedural knowledge. Rather, it used a group of subjects who, on the basis of their 
experiences, were presumed to have developed significant procedural knowledge for the task. 
Future research may be directed toward developing metrics to more readily ascertain the extent 
of a subject’s procedural knowledge (e.g., see Bonner and Walker 1994; Bonner et al. 1992). 
Finally, this study was not designed to identify the strategies tax researchers use to search for 
relevant authority. Future research should examine how search strategies change with the 
development of procedural knowledge and how search strategies are affected by time pressure 
(e.g., see Spilker and Prawitt 1994). 


66 The Accounting Review, January 1995 


APPENDIX A 
Partnership case: Problem statement and relevant key words 


Panel A: Problem Statement 
Facts: 


In late 1990, Nicole struck it rich in the California State Lottery. In January, 1991, 
she used some of her winnings to join Clark and Kelsey in forming the Aldia Limited 
Partnership to deal in real estate activities. Nicole, Clark, and Kelsey each contributed 
cash to the partnership. For their cash contributions, Nicole received a 50% interest as 
a general partner, Clark received a 30% interest as a general partner, and Kelsey received 
a 20% interest as a limited partner. 


In March of 1991, Aldia acquired a parcel of land with a fair market value of 
$800,000 from an unrelated party by paying $300,000 cash and taking the property 
subject to a mortgage of $500,000, for which the partnership assumed personal liability. 
By December 31, 1991, the value of the property had appreciated to $900,000. In 
November 1991, Aldia purchased a small adjoining lot to use for parking. The lot cost 
$120,000, of which Aldia paid $20,000 cash and borrowed the remaining $100,000, 
securing it with the mortgage on the smaller lot. The mortgage note provides that in the 
event of default, the holder’s only remedy is to foreclose on the property. However, to 
induce the lender to make the loan, Nicole personally guaranteed $10,000 of loan 
principal. Aldia made no principal payments during 1991 (in accordance with the 
mortgage agreements). 


Issue: 


What effect, if any, do the mortgages have on Nicole’s federal income tax basis in 
her partnership interest? 


Panel B: Relevant Key Words* 


ADJUSTED BASIS: 
partner's interest 
ALLOCATION: 
partnership liabilities 
BASIS: 
partnership liabilities: increase to 
NON RECOURSE FINANCING: 
partnership liabilities 
PARTNERS AND PARTNERSHIPS: 
basis of partner's interest: liabilities treated as distributions or contributions 
basis of partner's interest: loans affecting 
basis of partner's interest: non recourse loan 
liabilities: basis increase 
liabilities: sharing by partner's overview 
liabilities as part of basis 
liabilities, treatment as distributions or contributions: allocation among partners 
liabilities, treatment as distributions or contributions: basis, effect on 
liabilities, treatment as distributions or contributions: increase in 
liabilities, treatment as distributions or contributions: non recourse liabilities 
liabilities, treatment as distributions or contributions: non recourse loans and guarantees by partners 
liabilities, treatment as distributions or contributions: partners loans and guarantees 
liabilities, treatment as distributions or contributions: recourse liabilities 
non recourse liabilities 
recourse liabilities, sharing of: capital accounts deficit, effect of 


* The key words are organized as follows: 
MAIN HEADINGS: 
first-level key words: second-level key words. 


APPENDIX B 
Control case: Problem statement and relevant key words 


Panel A: Problem Statement 
Facts: 


Ceramco, Inc. is a U.S. corporation engaged in the manufacture of ceramic tile for 
sale in both the U.S. and abroad. Since its formation in 1987, Ceramco has been quite 
profitable. In late 1990, Ceramco decided that it needed a stronger presence in Europe 
to take advantage of the weak competition in a number of European countries. In January 
of 1991, Ceramco formed a wholly owned Irish corporation (“Forco”). Ceramco chose 
to locate its new subsidiary in Ireland because the Irish government promised Ceramco 
that Forco's profits would only be subject to a 15% tax rate. 


Forco purchases, for cash, raw tile from Ceramco at its current market price. After 
cutting the tile into various shapes, Forco paints it and packages it for sale. Forco then 
sells the tile to customers throughout Europe. During 1991, Forco did quite well, 
recording a pre-tax profit of $12 million (in U.S. dollars). However, besides the payments 
for the raw tile, Forco made no cash payments to Ceramco during 1991. 


Issue: 


Assuming no transfer pricing problems, what effect, if any, does Forco's profits 
have on Ceramco's 1991 U.S. federal income tax liability? 
Panel B: Relevant Key Words* 


CONTROLLED FOREIGN CORPORATIONS: 
foreign tax credit, special rules: gross up of amounts 
foreign taxes deemed paid 
gross income of U.S. Shareholders, inclusions: foreign base company income defined 
shareholders: subpart F income taxed to 
subpart F income: current earnings and profits, limited to 
subpart F income: defined 
U.S. shareholders: subpart F income 
DIVIDEND INCOME, CORPORATIONS: 
foreign corporation payor: gross up of dividends, foreign tax credit computation 
DIVIDENDS: 
controlled foreign corporations: earnings and profits 
EARNINGS AND PROFITS AVAILABLE FOR DIVIDENDS: 
controlled foreign corporations: adjustments 
FOREIGN BASE COMPANY INCOME: 
calculation of income 
exclusions from 
manufacturing 
other facts and circumstances tests 
FOREIGN CORPORATIONS: 
foreign tax credit: deemed paid credit 
FOREIGN TAX CREDIT: 
deemed paid credit: deemed dividend distribution 
limitation: look-through rules, controlled foreign corporations 
subpart F income 
IRELAND: 
tax treaty 
SUBPART F INCOME: 
deemed distribution of 
foreign tax credit for shareholders taxed 
inclusion as 


* The key words are organized as follows: 
MAIN HEADINGS: 
first-level key words: second-level key words. 
ABSTRACT: In auditing an account balance, an auditor must first assess the 
inherent risk that the balance is misrepresented. This paper identifies factors 
determining the accuracy of the auditor's inherent risk assessment in a hidden 
information audit setting, under the assumption that both players construct expec- 
tations through a process of rational inference known as rationalization. The 
accuracy of the auditor's risk assessment is shown to be influenced by the risk of 
unintentional errors, the players’ incentives, the precision of the auditor's data, and 
regulatory bounds on detection risk. 


Key Words: Auditing, Decision theory, Game theory, Rationality 


I. INTRODUCTION 


Decision theoretic audit models characterize the auditor's optimal choice of detection risk 
given that the auditor accurately assesses the exogenous level of inherent risk (Kinney 
1975).¹ Strategic audit models characterize the auditor’s optimal choice of detection risk 
given that the auditor accurately assesses the level of inherent risk which is chosen privately by 
a manager (Fellingham and Newman 1985, Newman and Noel 1989, Shibano 1990, Alles et al. 
1993 among others). However, it is difficult to know how the auditor can predict the manager’s 
behavior accurately (Fellingham and Newman 1985). Inaccurate predictions will lead the auditor 
to reject too many correct balances (an inefficient audit) or accept too many incorrect balances 
(an ineffective audit). Thus, the shift from a decision theoretic setting to a strategic setting raises 
an important question: in what circumstances can the auditor predict the level of inherent risk 
chosen privately by the manager? 


¹ Inherent risk is the probability that an error is introduced into an account (either intentionally or unintentionally), and 
detection risk is the probability that an auditor accepts an account balance, given that it is incorrect. Audit risk is the 
probability that a balance is incorrect and is accepted by the auditor (SAS 47, AICPA 1984). 
To address this question, this paper assumes that players reason about each other's decision 
problems by applying a method of rational inference called rationalization (Bernheim 1984, 
Pearce 1984). The analysis shows that the auditor's ability to predict the manager's action is 
negatively related to the degree of strategic dependence—the degree to which a change in the 
auditor’s expectation of the manager's action affects the manager's action, assuming that the 
manager responds optimally to the auditor's action. Strategic dependence is generally low if both 
players' optimal strategies are relatively insensitive to their own expectations, or if many 
expectations lead the auditor to choose such an extreme level of detection risk that the manager 
prefers to misrepresent always (or never). 

The analysis also shows how specific characteristics of the audit setting can alter strategic 
dependence: strategic dependence is decreased by increasing the probability of unintentional 
account balance errors; by penalizing the auditor severely for incorrect acceptance when the 
probability of unintentional error is high; by penalizing the auditor less severely when the 
probability of unintentional error is low; by increasing the quality of the auditor's data 
technology; and by imposing a maximum acceptable level of detection risk, regardless of the 
auditor's assessment of inherent risk as required by SAS 47 (AICPA 1988). These findings 
provide new insight into how audit policies and regulations might improve the optimality of 
auditor behavior, and when Nash equilibrium outcomes will (and will not) have predictive power. 

The remainder of the paper is organized as follows. Section II presents a model of the auditing 
environment, similar to Shibano's (1990) “hidden information" model. Section III introduces 
rationalization and the notion of strategic dependence. Section IV examines how changes in the 
audit environment can influence the degree of strategic dependence. Section V concludes. 


II. THE AUDIT MODEL 


This section presents a simple strategic audit model in which an auditor tests a manager's 
representation of a single account. The model differs from Shibano's (1990) hidden information 
model in two ways. First, the account balance may be represented incorrectly either through the 
manager's intentional misrepresentation or through an unintentional error. Second, as in Alles et 
al. (1993), the auditor has imperfect information on the manager's incentive to misrepresent. To 
highlight the fact that the game involves two separate but related optimization problems, the 
exposition presents separately the decisions facing the auditor and the manager. 


Strategies 


A manager is endowed with a true account balance $\theta_k$, which takes on a high value $\theta_H$ with 
probability $p$ or a low value $\theta_L$ with probability $1-p$. The value $\theta_H$ is interpreted as the “more 
favorable" of the two, as if it were a higher balance in an asset account. The accounting system 
does not necessarily reveal the true balance; instead, the generated balance emerging from the 
accounting system may overstate the true balance because of erroneous bookkeeping entries. 
Specifically, given a true balance $\theta_L$, the generated balance takes on the value $\tilde{\theta}_L$ with 
probability $1-\eta$, and takes on the value $\tilde{\theta}_H$ with probability $\eta$. A lower rate of error $\eta$ indicates a stronger 
internal control system. It is assumed the manager investigates whenever the accounting system 
reports a low balance, and corrects any errors; thus, the generated balance never understates the 
true balance.² After learning the generated balance, the manager may represent the balance as 
high or low, with these represented balances being denoted $\hat{\theta}_H$ and $\hat{\theta}_L$. For simplicity, the 


² Allowing unintentional errors to understate as well as overstate the true balance causes only minor changes in the 
substantive results, but significantly complicates the exposition. 
manager's misrepresentation is assumed not to be corrected by the control system (i.e., control 
risk = 1). 

The auditor tests the validity of the representation by conducting substantive tests (collecting 
sample data) which generate a signal $z$. The signal is distributed normally with finite variance 
$\sigma^2 > 0$. The mean of the distribution is 0 if $\theta_k = \theta_L$ and is $\mu > 0$ otherwise. Both $\sigma$ and $\mu$ are 
exogenous. After learning $z$, the auditor recommends either that the audit firm attest to the 
manager's representation in the final audit report (accept, denoted “a”) or that the audit firm 
conduct further costly testing which is likely to reveal the true balance (reject, denoted “r”). 

A timeline of events is shown in figure 1. 


The Auditor’s Decision Problem 


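Let $U(d \mid \hat{\theta}_i, \theta_j)$ be the payoff to the auditor of making decision $d \in \{a, r\}$ when the 
manager’s representation is $\hat{\theta}_i$ and the true balance is $\theta_j$. It is assumed that the auditor is penalized 
for incorrectly accepting or rejecting high representations, but has an incentive always to accept 
low representations.³ This assumption is formalized as 


$$U(a \mid \hat{\theta}_H, \theta_L) - U(r \mid \hat{\theta}_H, \theta_L) = -\beta < 0, \qquad \text{(A1)}$$
$$U(r \mid \hat{\theta}_H, \theta_H) - U(a \mid \hat{\theta}_H, \theta_H) = -\alpha < 0, \text{ and}$$
$$U(a \mid \hat{\theta}_L, \cdot) > U(r \mid \hat{\theta}_L, \cdot),$$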


where $\alpha$ and $\beta$ represent the penalties to incorrect rejection and incorrect acceptance. The 
“liability ratio" $\lambda = \beta/\alpha$ reflects the relative magnitude of these penalties. The auditor’s optimal 
strategy is to accept a high representation ($\hat{\theta}_H$) if the probability that the balance is truly high is 
sufficiently high relative to the liability ratio, and reject otherwise. Because the probability of a 
high true balance increases as $z$ increases, an auditor preferring to accept (reject) a representation 
for one value of $z$ will necessarily prefer to accept (reject) it for a higher (lower) value of $z$. Thus, 
it is possible to identify a cutoff level $c$ corresponding to the value of $z$ for which the auditor 
is indifferent between accepting and rejecting. (See Newman and Noel (1989); Shibano (1990); 
and Alles et al. (1993) for formal proofs.) This cutoff value uniquely determines the probability 
that the auditor accepts a high reported balance given that the true balance is low. This probability, 
which corresponds to detection risk in the standard audit risk model of SAS 47 (AICPA 1984), 
is computed as 


$$\delta = \Pr(a \mid \hat{\theta}_H, \theta_L) = \Pr(z > c \mid \theta_L) \qquad (1)$$
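
Because the signal is normal with mean 0 and standard deviation $\sigma$ when the true balance is low, the probability in equation (1) can be written in closed form. The worked step below is only an illustrative sketch; $\Phi$ (the standard normal distribution function) is notation introduced here and does not appear in the original text.

```latex
\delta = \Pr(z > c \mid \theta_L) = 1 - \Phi\!\left(\frac{c}{\sigma}\right),
\qquad \text{e.g. } c = \sigma \;\Rightarrow\; \delta = 1 - \Phi(1) \approx 0.159 .
```

A higher cutoff $c$ therefore means a lower detection risk: the auditor rejects more high representations, including more of those that are truly low.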


³ All payoffs are assumed to be determined by unmodeled events, as in most strategic audit models (Fellingham and 
Newman 1985, Newman and Noel 1989, Shibano 1990, Alles et al. 1993). Motivations for the various payoffs are 
provided in these papers. Antle (1989) and Watts (1990) discuss potential problems with this modeling strategy. 


FIGURE 1 
Timeline of Events 

t = 1    Nature chooses 
t = 2    Manager learns 
t = 3    Manager chooses 
t = 4    Auditor learns 
t = 5    Auditor chooses 
For the later analysis, it will be helpful to have a representation of the auditor’s optimal choice 
of detection risk $\delta$ as a function of the probability $\tau$ that the manager reports a high balance when 
in fact the generated balance is low (i.e., $\tau$ represents the probability that the manager intentionally 
misrepresents a low balance). The following proposition shows that this “best response” function 
$\delta(\tau)$ decreases in $\tau$. 


Proposition 1.1. The auditor's optimal choice of detection risk falls as $\tau$ rises; that is, 
$\delta'(\tau) < 0$. 
Proof. See Appendix for all proofs. 


Because the manager's choice of t determines the inherent risk facing the auditor (the 
probability that a balance represented as high is truly low), the proposition is consistent with SAS 
47, which asserts that the auditor should choose lower levels of detection risk for higher 
assessments of inherent risk.⁴ Note that the auditor wishes to accept some reports even if the 
manager always misrepresents ($\tau = 1$), because some balances are truly high. The auditor also 
wishes to reject some reports if the manager never misrepresents ($\tau = 0$), because some balances 
are misstated by unintentional error. Therefore, $\delta(1)$ is strictly greater than 0, while $\delta(0)$ is strictly 
less than 1. 
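
The shape of the auditor's best response can be illustrated numerically. The sketch below is not the paper's proof (which is in its Appendix): the cutoff formula is derived here from the normal likelihood ratio, on the assumption that the auditor accepts whenever the posterior odds of a truly high balance exceed the liability ratio. The function names and the parameter names `eta` (unintentional error rate), `lam` (liability ratio), `mu` and `sigma` (signal mean and standard deviation) are hypothetical labels for the model's primitives.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def auditor_cutoff(tau, p, eta, lam, mu, sigma):
    """Cutoff c: a high representation is accepted when the signal z > c.

    Assumption of this sketch: accept when posterior odds of a truly high
    balance exceed lam, with z ~ N(mu, sigma^2) if the balance is high and
    z ~ N(0, sigma^2) if it is low.
    """
    # Probability that the balance is truly low yet represented as high:
    # either an unintentional error or an intentional misrepresentation.
    low_reported_high = (1.0 - p) * (eta + (1.0 - eta) * tau)
    high_reported_high = p
    if low_reported_high == 0.0:
        return -math.inf  # no high report can be wrong, so always accept
    odds = lam * low_reported_high / high_reported_high
    return mu / 2.0 + (sigma ** 2 / mu) * math.log(odds)


def detection_risk(tau, p, eta, lam, mu, sigma):
    """delta(tau) = Pr(z > c | true balance low)."""
    c = auditor_cutoff(tau, p, eta, lam, mu, sigma)
    return 1.0 - normal_cdf(c / sigma)


if __name__ == "__main__":
    for tau in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(tau, round(detection_risk(tau, 0.5, 0.1, 2.0, 1.0, 1.0), 3))
```

Under these assumptions the printed detection risk falls as `tau` rises and stays strictly between 0 and 1 at both endpoints, matching the qualitative claims in the text.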


The Manager's Decision Problem 


The manager's incentives are characterized by three assumptions. First, the manager always 
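prefers to represent the balance as high if the generated balance is high.⁵ Letting $V(\hat{\theta}_i \mid \tilde{\theta}_j, \theta_k, d)$ 
be the utility to the manager making representation $\hat{\theta}_i$ given generated balance $\tilde{\theta}_j$, true balance 
$\theta_k$, and auditor decision $d$, the assumption is 


$$V(\hat{\theta}_H \mid \tilde{\theta}_H, \theta_k, d) > V(\hat{\theta}_L \mid \tilde{\theta}_H, \theta_k, d), \quad \text{for } k \in \{L, H\},\ d \in \{a, r\}. \qquad \text{(A2)}$$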


Second, the manager’s incentive to misrepresent account balances varies with circumstances 
which are not entirely known by the auditor (SAS 53, AICPA, 1988). The model captures this 
variation by allowing there to be a variety of manager “types.” Let $V_a(t) = V(\hat{\theta}_H \mid \tilde{\theta}_L, \theta_L, a, t)$ 
be the payoff to the manager of type $t$ for having the auditor accept a misrepresentation. For 
simplicity, $V_a(t)$ is assumed to decrease linearly in $t$, according to the equation $V_a(t) = w - vt$, 
where $v$ is a positive constant. The manager knows $t$ with certainty; the auditor knows only that 
$t$ is drawn from a uniform probability distribution over the interval [0, 1].⁶ $V_a(t)$ is assumed to 
be the only aspect of the manager's payoff which depends on the manager's type. 


$$V_a(t) = V(\hat{\theta}_H \mid \tilde{\theta}_L, \theta_L, a, t) = w - vt. \qquad \text{(A3)}$$


The final assumption places bounds on the payoff to having a low representation accepted, 
denoted $V_L = V(\hat{\theta}_L \mid \tilde{\theta}_L, \theta_L, a)$. This payoff lies below the lowest possible payoff $V_a(1)$ to 


⁴ “Detection risk should bear an inverse relationship to inherent and control risk. The less the inherent and control risk 
the auditor believes exists, the greater the detection risk he can accept." (SAS 47, Sec. 21) 

⁵ This assumption differs from assumption (A6) of Shibano (1990), which permits the manager with a high balance to 
prefer reporting a low balance under certain beliefs. Although the equilibrium is the same under either assumption, the 
present assumption simplifies the examination of rational nonequilibrium behavior. 

⁶ The assumption of a uniform distribution is without loss of generality if $V_a(t)$ is unrestricted (other than the requirement 
that it be monotonically decreasing in $t$). The restriction of $V_a(t)$ to a linear function is helpful in simplifying the analysis, 
but is not critical to the results. 
having a misrepresentation accepted, but lies above the payoff to having a misrepresentation 
rejected, denoted $V_r = V(\hat{\theta}_H \mid \tilde{\theta}_L, \theta_L, r)$. This assumption reflects both the incentive to 
misrepresent and the penalties to having a misrepresentation detected, which may arise from legal 
ramifications, actions by superiors, or reputation loss. 


$$V_a(1) > V_L > V_r. \qquad \text{(A4)}$$


Given these assumptions, the manager’s optimal strategy is to misrepresent a low generated 
balance as a high balance whenever the payoff to an accepted misrepresentation is sufficiently 
great relative to the likelihood that the misrepresentation will be rejected, and to represent 
honestly otherwise. Because the payoff to an accepted misrepresentation decreases as $t$ rises, the 
manager can identify a critical type $\tau$ such that the manager's optimal strategy is to misrepresent 
if $t < \tau$, and to represent honestly otherwise. Because $t$ is distributed uniformly over [0, 1], the 
probability that the manager misrepresents is simply equal to $\tau$, making it possible to use $\tau$ to 
represent both the critical type and the probability of intentional misrepresentation (as in 
Proposition 1.1). 
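
The critical type can be made concrete with a small computation. The sketch below assumes that a misrepresented low balance is accepted with probability equal to the detection risk and that an honest low report is always accepted, so that a manager of type t misrepresents when delta * (w - v*t) + (1 - delta) * V_r exceeds V_L. The function name and the parameter names `v_low` (payoff to an accepted low representation) and `v_rej` (payoff to a rejected misrepresentation) are hypothetical.

```python
def critical_type(delta, w, v, v_low, v_rej):
    """Share of manager types in [0, 1] who misrepresent a low balance.

    Solves delta * (w - v*t) + (1 - delta) * v_rej = v_low for t and clips
    the answer to [0, 1], since types are uniform on that interval.
    Assumes w - v > v_low > v_rej, in the spirit of assumption (A4).
    """
    if delta <= 0.0:
        # Every misrepresentation is rejected, which is worse than honesty.
        return 0.0
    t_star = (w - v_rej - (v_low - v_rej) / delta) / v
    return min(1.0, max(0.0, t_star))


if __name__ == "__main__":
    # Illustrative payoffs only: w = 10, v = 4, v_low = 5, v_rej = 0.
    for delta in (0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
        print(delta, round(critical_type(delta, 10.0, 4.0, 5.0, 0.0), 3))
```

With these illustrative payoffs no type misrepresents when detection risk is 0.5 or lower, every type misrepresents once detection risk reaches 5/6, and the share rises with detection risk in between.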

The following proposition demonstrates that as the auditor's choice of detection risk rises, 
the manager's optimal strategy $\tau(\delta)$ also rises. 


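Proposition 1.2. Let $\tau^{-1}(1) = \min\{\delta \mid \tau(\delta) = 1\}$ and let $\tau^{-1}(0) = \max\{\delta \mid \tau(\delta) = 0\}$. Then for 
$\tau^{-1}(1) > \delta \geq \tau^{-1}(0)$, the optimal rate of misrepresentation rises as $\delta$ rises; that is, $\tau'(\delta) > 0$. 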


Because the incentive to misrepresent is finite ($V_a(t) < \infty$), there is always some level of 
detection risk which is so low that no manager type wishes to misrepresent; the highest such level 
is denoted $\tau^{-1}(0) > 0$. Because the incentive to misrepresent is positive ($V_a(t) > V_L$), there is a level 
of detection risk which is so high that all manager types wish to misrepresent; the lowest such level 
is denoted $\tau^{-1}(1) < 1$. 


III. STRATEGIC DEPENDENCE 


Developing Expectations 


The previous section characterizes the auditor’s optimal choice of detection risk given an 
expectation of the manager’s choice of inherent risk; however, the analysis provides no indication 
of how the auditor might form such an expectation. Decision theoretic models (Kinney 1975) take 
the rate of inherent risk to be entirely exogenous, so it is reasonable to assume that the auditor 
knows that rate (just as the auditor knows all other exogenous parameters). In strategic models 
like the one described here, however, the auditor must somehow predict the manager’s solution 
to the manager’s decision problem, knowing that the manager is simultaneously attempting to predict 
the auditor’s solution to the auditor’s decision problem. 

Fellingham and Newman (1985) recognized the difficulty of forming accurate expectations 
about another player’s action.⁷ Instead of simply assuming that the players are able to form these 


⁷ “[I]f the auditor has correct conjectures regarding client behavior, decision theory will lead to an optimal auditor 
decision. The method by which the auditor obtains these correct conjectures is problematic” (Fellingham and Newman 
1985, 647). 
expectations accurately (as is assumed in previous analyses), this paper assumes only that both 
players construct their predictions of the other player's actions from a process of rational 
inference, and attempts to answer the following question: In what circumstances can the auditor 
predict the manager's choice of inherent risk?⁸ 

Specifically, both players are assumed to be rational in the sense of Savage (1954), always 
choosing the optimal response to their expectations of the other player's strategy. Both players 
also know that both players are rational; that both players know that both players are rational; 
and so on. Given this knowledge, the players can refine their expectations of each other's strategy, 
by ruling out irrational strategies through the iterated deletion of dominated strategies. This 
reasoning process will be referred to as rationalization (Bernheim 1984; Pearce 1984). 

The analysis shows that in some cases even a few iterations of rationalization allow the 
auditor to predict the manager's strategy exactly, and allow the manager to know that the auditor 
can do so. There is only one outcome for which this is possible: the Nash equilibrium $(\tau^*, \delta^*)$, 
determined by the intersection of the best response functions $\tau(\delta)$ and $\delta(\tau)$. Thus, in some 
circumstances the analysis justifies the assumption of Nash equilibrium behavior central to 
previous strategic models.¹⁰ In other cases, however, even an infinite number of iterations of 
rationalization will not restrict the strategies the auditor can rationally expect (and the manager 
can rationally choose). In these cases, the auditor is unlikely to predict the manager's strategy 
accurately. For the same reason, a model assuming Nash equilibrium behavior is unlikely to 
predict either player's strategy accurately.¹¹ 


Formalization 


Let $T = [0, 1]$ represent the range of critical types $\tau$ the manager can choose and let $D = [0, 1]$ 
represent the range of detection risk levels $\delta$ the auditor can choose. For any subset $T'$ contained 
in $T$, $B(T')$ is the set of all of the auditor's strategies which are best responses to any of the 
manager's strategies within $T'$. Similarly, for any subset $D'$ contained in $D$, $B(D')$ is the set of all 
of the manager's strategies which are best responses to any of the auditor's strategies within $D'$. 
That is, 
$$B(T') = \{\delta \in D : \delta = \delta(\tau) \text{ for some } \tau \in T',\ T' \subseteq T\}, \qquad (2)$$
$$B(D') = \{\tau \in T : \tau = \tau(\delta) \text{ for some } \delta \in D',\ D' \subseteq D\}.$$


It is convenient to use the notation $B(B(\cdot)) = B^2(\cdot)$, $B(B(B(\cdot))) = B^3(\cdot)$, and so on. Note that the 
function $B(\cdot)$ sometimes denotes the set of the manager's best responses to a set of auditor 
strategies, and sometimes denotes the set of the auditor's best responses to a set of manager 
strategies. The correct interpretation is always determined by the argument of the function. 


⁸ One could ask a similar question about the circumstances in which the manager can predict the auditor's choice of 
detection risk. The nature of the analysis makes it clear that the answer must be the same to both questions. This paper 
addresses only the auditor's prediction, as that prediction is of more direct interest to accounting researchers, auditors 
and regulators. 

⁹ In a two-player two-strategy game, rationalization and the iterated deletion of dominated strategies are identical; in more 
complex games, this identity may not be satisfied. 

¹⁰ The proof that such an outcome exists and is unique in the present model is nearly identical to that provided by Shibano 
(1990) and Newman and Noel (1991). 

¹¹ The results of the analysis are similar to those derived from apparently dissimilar processes, such as learning models 
based on fictitious play (Brown 1951), behaviorist psychology (Herrnstein and Vaughn 1980), or evolutionary models 
of adaptive behavior (Hofbauer and Sigmund 1988). Circumstances in which rationalization yields (does not yield) 
Nash equilibrium behavior are typically also circumstances in which these other processes yield (do not yield) Nash 
equilibrium behavior (see also Moulin 1984). In addition, the predictions of rationalization have considerable 
experimental support. Results from Bloomfield (1993), and O'Neill (1987) indicate that Nash equilibria may not be 
easily achieved when not predicted by rationalization. 
If both players are rational (as defined above), then the auditor chooses a strategy from 
$D_0 = B(T)$ and the manager chooses a strategy from $T_0 = B(D)$. If both players know that both players 
are rational, then the auditor chooses from the set $D_1 = B(B(D)) = B(T_0)$ and the manager chooses 
from the set $T_1 = B(B(T)) = B(D_0)$. This process of inference is referred to as first-order 
rationalization, and is the first step in considering the other player's decision problem. 
Generalizing to $n \geq 1$, let $T_n = [\underline{\tau}_n, \overline{\tau}_n]$ and $D_n = [\underline{\delta}_n, \overline{\delta}_n]$ represent the interval sets of manager and auditor 
strategies consistent with $n$th-order rationalization. (Because $T$ and $D$ are intervals, the 
monotonicity of $\delta(\tau)$ and $\tau(\delta)$ guarantees that $T_n$ and $D_n$ are intervals also.) These sets can be constructed 
iteratively according to the definition 


$$T_n = B(D_{n-1}) = [B(\underline{\delta}_{n-1}), B(\overline{\delta}_{n-1})], \qquad (3)$$
$$D_n = B(T_{n-1}) = [B(\overline{\tau}_{n-1}), B(\underline{\tau}_{n-1})].$$


The set $T_n$ represents the set of strategies which the manager could rationally choose given 
that the players endure $n$ iterations of rationalization. Measure the range of this set as $|T_n| = \overline{\tau}_n - \underline{\tau}_n$. 
In some cases $|T_n| = 0$ because $T_n$ consists of a single point. This must be the Nash equilibrium 
strategy for the manager, $\tau^*$, because it is the only point for which $B(B(\tau)) = \tau$. As $|T_n|$ increases, the 
auditor's prediction of the manager's strategy is apt to become less accurate. Also, the players' 
choices are more likely to deviate significantly from the Nash equilibrium outcome. Thus, 
inherent risk assessments are likely to be accurate (and the Nash equilibrium outcome is likely 
to have predictive power) only when $T_n$ is narrow. 
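
The iterative construction of the rationalizable sets can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's model: the function `rationalize` and the two linear best response functions are hypothetical, chosen only to be monotone (the auditor's response decreasing, the manager's increasing) and to stay inside [0, 1].

```python
def rationalize(delta_of_tau, tau_of_delta, n_max):
    """Return the manager's intervals T_0, ..., T_{n_max} as (low, high) pairs.

    Starts from the full strategy sets T = D = [0, 1] and repeatedly applies
    the best response map to interval endpoints.
    """
    t_prev, d_prev = (0.0, 1.0), (0.0, 1.0)
    history = []
    for _ in range(n_max + 1):
        # tau(.) is increasing, so endpoints keep their order.
        t_next = (tau_of_delta(d_prev[0]), tau_of_delta(d_prev[1]))
        # delta(.) is decreasing, so the endpoints swap.
        d_next = (delta_of_tau(t_prev[1]), delta_of_tau(t_prev[0]))
        history.append(t_next)
        t_prev, d_prev = t_next, d_next
    return history


def example_delta(tau):
    """Hypothetical auditor best response: flat and decreasing."""
    return 0.6 - 0.3 * tau


def example_tau(delta):
    """Hypothetical manager best response: flat and increasing."""
    return 0.1 + 0.5 * delta


if __name__ == "__main__":
    for n, (low, high) in enumerate(rationalize(example_delta, example_tau, 5)):
        print(n, round(low, 4), round(high, 4), "width", round(high - low, 4))
```

For these flat best responses the interval widths shrink at every step and the intervals close in on the fixed point of the two functions; steeper functions (clipped to [0, 1]) would shrink more slowly or not at all.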


Optimality and the Accuracy of Expectations 


Using the process of rationalization, the auditor will not expect to confront any manager 
strategies lying outside the interval $T_n$. However, the process provides no guidance for which of 
the strategies within $T_n$ the auditor should expect. Assume that the auditor expects (and the 
manager chooses) an arbitrary one of those strategies, and that the auditor then chooses the level 
of detection risk which is optimal given that expectation. Because that arbitrary expectation may 
differ from the level of inherent risk that the manager actually chooses, the auditor may deviate 
from the "ex post optimal" level of detection risk (the level of detection risk that is optimal given 
the manager's actual choice of inherent risk). Furthermore, one can measure the ex post optimality 
of the auditor's choice by comparing the payoff to that choice to the payoff to the ex post optimal 
choice. 

By the nature of the auditor's payoff function, the ex post optimality of the auditor’s choice 
will be a decreasing function of the distance between the auditor’s actual choice and the ex post 
optimal choice. While computing an "expected" level of ex post optimality is beyond the scope 
of this paper,¹² it is clear that as the interval $T_n$ becomes wider, the actual and ex post optimal levels 
of detection risk are likely to be more widely separated, and the degree of ex post optimality is 
likely to decline. 


$|T_n|$ and Strategic Dependence 


To understand how game parameters can affect $|T_n|$, it is helpful to characterize $|T_n|$ as the 
product of the average slopes of the players’ best response functions. Let $\delta'(T')$ be the average 
slope of $\delta(\tau)$ over the interval $T' \subseteq T$, and define $\tau'(D')$ similarly. The auditor expects the manager 


¹² This would require computing the probabilities of various outcomes not eliminated by rationalization. However, 
rationalization provides no means for such a computation. 
to choose a misrepresentation rate within a range of $|T| = 1$. Consequently, the auditor chooses a 
strategy from the interval $D_0$ with length $|D_0| = -\delta'(T)\,|T|$. A manager knowing that the auditor 
expects a strategy within $T$ (one level of rationalization) chooses a strategy from the interval $T_1$ 
with range $|T_1| = \tau'(D_0)\,|D_0|$. Combining these equations yields 


$$|T_1| = -\delta'(T)\,\tau'(D_0)\,|T| = A_1\,|T|. \qquad (4)$$


The product $A_1$ serves as a measure of the strategic dependence of the audit setting, and reflects 
the extent to which variations in the auditor’s expectation of the manager's strategy change the 
manager's strategy (assuming the manager responds optimally to the auditor's optimal strategy). 
The greater the measure of strategic dependence, the greater $|T_n|$ must be. Generalizing this logic 
to $n > 1$ yields 


$$|T_n| = -\delta'(T_{n-2})\,\tau'(D_{n-1})\,|T_{n-2}| = A_n\,|T_{n-2}|, \qquad (5)$$


where $A_n$ may be said to measure the “$n$th order” of strategic dependence. Iterating this process, 
and recognizing that $|T| = 1$, yields 


$$|T_n| = \prod_{i=0}^{n/2} A_{2i} \ \text{ for } n \text{ even, and } \ |T_n| = \prod_{i=0}^{(n-1)/2} A_{2i+1} \ \text{ for } n \text{ odd}. \qquad (6)$$

This construction of $|T_n|$ shows that if the slopes of $\delta(\tau)$ and $\tau(\delta)$ are flat, then inherent risk 
assessments will be relatively accurate (because the $T_n$ will be narrow intervals). Intuitively, if the 
slope of the auditor’s best response function is relatively flat, then the auditor’s choice is 
determined primarily by the exogenous parameters of the game, and only slightly by the auditor’s 
expectation of the manager’s behavior. Consequently, the manager’s task approximates a simple 
one-person decision problem, with the accuracy of this approximation increasing as the slope of 
the manager’s best response function becomes flatter. If the slopes of both best response functions 
are sufficiently flat, the auditor can predict the manager’s behavior with considerable accuracy. 
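
As a hedged illustration of this product-of-slopes logic, suppose (purely as an assumption, not a case analyzed in the paper) that both best response functions are linear and remain strictly inside [0, 1], with hypothetical intercepts $a$, $c$ and slopes $-b$ and $d$. Each pair of best-response steps then multiplies the range of rationalizable manager strategies by $bd$:

```latex
\delta(\tau) = a - b\,\tau, \quad \tau(\delta) = c + d\,\delta, \quad b, d > 0
\;\Longrightarrow\;
|D_0| = b, \quad |T_1| = d\,|D_0| = bd, \quad |T_3| = (bd)^2, \quad
|T_n| = (bd)^{(n+1)/2} \ \text{ for odd } n .
```

For example, $b = 0.3$ and $d = 0.5$ give $|T_1| = 0.15$ and $|T_3| = 0.0225$, so the range collapses quickly; the range shrinks toward a single point whenever $bd < 1$.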

For example, in the game of figure 2 the small slope of $\delta(\tau)$ guarantees that the auditor chooses 
a strategy within a narrow range regardless of his or her expectation. The small slope of $\tau(\delta)$ 
guarantees that this small range of auditor strategies will induce an even smaller range of manager 
strategies. The dashed line illustrates the effect of repeated iterations by plotting the sequence 
$\delta(0)$, $B(\delta(0))$, $B^2(\delta(0))$, $B^3(\delta(0))$, etc. By equation (3), this sequence describes the upper bound 
of $D_0$, the upper bound of $T_1$, the lower bound of $D_2$, the lower bound of $T_3$, and so on. The inward 
spiral indicates that each iteration of rationalization eliminates additional strategies. 

It can be difficult to determine whether infinite iterations of this process will cause $T_n$ to 
converge to $\tau^*$, because the nonlinear nature of the best response functions makes it possible that 
$A_n$ converges to 1. (In this case $|T_n|$ converges to a strictly positive number.) However, the 
propositions to follow are able to identify conditions under which parameter changes reduce $|T_n|$ 
for all finite $n \geq 1$. Such a result implies that the parameter change increases the accuracy of the 
auditor's inherent risk assessment regardless of the number of iterations of rationalization the 
players apply. 

If the slopes of the best response function are very steep, rationalization may be unable to 
eliminate even a single manager strategy. In the game of figure 3, an auditor expecting a high rate 
of misrepresentation selects such a low level of detection risk that the manager's optimal strategy 
is to represent honestly always. An auditor expecting a low rate of misrepresentation chooses such 
a high level of detection risk that the manager's optimal response is to misrepresent always. As 
FIGURE 2 
A game with low strategic dependence 
The function $\delta(\tau)$ depicts the auditor’s optimal choice of detection risk for every probability 
$\tau$ of intentional misrepresentation by the manager. The function $\tau(\delta)$ depicts the manager’s 
optimal probability of intentional misrepresentation for every level $\delta$ of detection risk. Because 
the slopes of these functions are relatively flat, the boundaries of the sets $T_n$ spiral inward toward 
the Nash equilibrium $(\tau^*, \delta^*)$ with successive levels of rationalization, as shown by the dashed 
line. 


a result, the manager can rationalize any strategy as being a best response to some rational 
expectation of the auditor ($|T_1| = A_1 = 1$). The same argument shows that further iterations of 
rationalization are also ineffective, so that $|T_n| = A_n = 1$ for all $n$. In this setting the auditor is likely 
to deviate considerably from the ex post optimal choice of detection risk. 

Recall that the slope of $\tau(\delta)$ is zero outside the interval $T^{-1} = (\tau^{-1}(0), \tau^{-1}(1))$. Therefore, the 
construction of $|T_n|$ in (6) shows that strategic dependence depends not only on the slopes of the 
best response functions, but also on the position of $D_0 = [\delta(1), \delta(0)]$ relative to the interval $T^{-1}$. 
In the game of figure 4, for example, the slope of $\delta(\tau)$ is the same as in figure 3, while the slope 
of $\tau(\delta)$ is even steeper. However, $T^{-1}$ is shifted downward enough that the manager prefers to 
misrepresent for any level of detection risk chosen by a rational auditor: $\delta(1) > \tau^{-1}(1)$. In this case, 
the slope of $\tau(\delta)$ is zero over the domain $D_0$, so $A_1 = 0$ and one iteration of rationalization reduces 
FIGURE 3 


A game with high strategic dependence 
Because the slopes of the best response functions are very steep, the auditor’s highest rational 
level of detection risk induces the manager to misrepresent always ($\delta(0) > \tau^{-1}(1)$), while the 
auditor's lowest rational level of detection risk induces the manager to misrepresent never 
($\delta(1) < \tau^{-1}(0)$). As a result, the auditor can expect (and the manager can choose) any probability of 
intentional misrepresentation. 


the game into a simple decision problem for each player. The players therefore choose the Nash 
equilibrium strategy as long as both players know that both players are rational. This case also 
arises when a rational manager always prefers to report honestly given every rational choice of 
detection risk. 


IV. AUDIT CHARACTERISTICS AFFECTING STRATEGIC DEPENDENCE 


This section explores how strategic dependence (and therefore the ability of the auditor to 
predict the manager's strategy) is affected by characteristics of the audit setting: the risk of 
FIGURE 4 
A game with zero strategic dependence, even though the slopes of the 
best response functions are steep 
The slopes of the best response functions $\delta(\tau)$ and $\tau(\delta)$ are even steeper in this game than in 
the game of figure 3; however, $\tau(\delta)$ has been increased for all $\delta$, so that any rational choice of 
detection risk by the auditor induces the manager to misrepresent always. As a result, the auditor 
can know for certain that the manager misrepresents always. 


unintentional errors; the auditor's incentives; the precision of the auditor's data; the manager's 
incentives; and exogenous restrictions on the auditor’s choice of detection risk. All of these results 
are driven by the effects of the parameter changes on the slopes and levels of the best response 
functions. 


Risk of Unintentional Errors 
The inherent risk of a misrepresentation is partly determined by the risk that the balance 
generated by the accounting system is unintentionally inflated; this risk is denoted 


$$R = \Pr(\theta_L \mid \tilde{\theta}_H) = \frac{(1-p)\,\eta}{p + (1-p)\,\eta}. \qquad (7)$$

Whenever the balance generated by the accounting system inflates the true balance, the 
balance is misrepresented regardless of the manager’s choice of t. Therefore, the manager’s 
strategy affects the auditor’s decision problem only to the extent that the accounting system 
reports balances accurately. If the risk of unintentional error is sufficiently high, the rate of 
inherent risk varies little with the manager’s strategy, so that the game approximates a one-person 
decision problem for the auditor. Consequently, the degree of strategic dependence is reduced, 
and the auditor is able to predict the manager’s strategy more accurately (and attain a greater level 
of ex post optimality). 
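
The argument can be illustrated numerically. The following is a minimal sketch, assuming the inherent-risk expression P(τ) = [R + τ(1 − R)](1 − p) / {[R + τ(1 − R)](1 − p) + p} as it appears to read in the appendix; the interpretation of p as the prior probability that the true balance is high, the function names, and the parameter values are illustrative assumptions rather than part of the original analysis.

```python
def inherent_risk(tau: float, R: float, p: float) -> float:
    """Probability that a balance represented as high is truly low.

    tau: manager's misrepresentation rate
    R:   risk that the accounting system unintentionally inflates the balance
    p:   prior probability that the true balance is high (assumed meaning)
    """
    low_but_represented_high = (R + tau * (1.0 - R)) * (1.0 - p)
    return low_but_represented_high / (low_but_represented_high + p)


def inherent_risk_spread(R: float, p: float) -> float:
    """How far inherent risk can move as tau ranges over [0, 1]."""
    return inherent_risk(1.0, R, p) - inherent_risk(0.0, R, p)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for R in (0.05, 0.50, 0.95):
        print(f"R = {R:.2f}: spread = {inherent_risk_spread(R, p=0.5):.3f}")
```

Under these assumed values the spread shrinks as R rises, which is the sense in which the auditor's problem approaches a one-person decision problem.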


Proposition 3.1. There exists a value R' such that for all R > R', either T_n = τ* = 0, or increases
in R reduce |T_n| for all n ≥ 1.


If the risk of unintentional error is very high, then the auditor will prefer to reject frequently
(choose a low level of detection risk), even when expecting very little intentional misrepresentation.
As a result, the manager who knows the auditor is rational will prefer to represent honestly
unless the incentive for successful misrepresentation is extremely high. Increasing the rate of
unintentional error increases the incentive needed for the manager to be able to justify
misrepresentation, and therefore reduces the strategic dependence of the audit (unless the rate of
unintentional error is already so high that strategic dependence has been eliminated).

This proposition supports the general intuition that unintentional errors reduce the strategic 
characteristics of an audit, and make it easier for the auditor to predict the manager's strategy. 
Note, however, that the proposition says nothing about the degree of strategic dependence when 
the risk of unintentional error is low. Strategic dependence in that case depends on other 
parameters, which are discussed below. 


Auditor Liability 


The auditor's incentives can be characterized by the liability ratio λ = β/α, the ratio of the
penalty for incorrect acceptance to the penalty for incorrect rejection. As in Alles et al. (1993),
a higher liability ratio reduces the auditor's optimal level of detection risk for any expectation,
and consequently reduces the Nash equilibrium levels of misrepresentation, detection risk and
audit risk. The following proposition shows that the liability ratio also affects the strategic
dependence of the audit setting.


Proposition 3.2. There exist liability ratios λ' and λ'', with λ' < λ'', such that
(a) if λ < λ', either T_n = τ* = 1 or decreases in λ cause |T_n| to decrease;
(b) if λ > λ'', either T_n = τ* = 0 or increases in λ cause |T_n| to decrease.


Both increases in high ratios and decreases in low ratios can reduce strategic dependence.
Both results are driven by the effect of the liability ratio on the endpoints of D_0. If the liability ratio
becomes extremely low, then D_0 shifts upward so far that any expectation leads the auditor to
choose such a high level of detection risk that the manager's best response is to misrepresent
always, as in figure 4. If the liability ratio becomes very high, then D_0 shifts downward so far that
any expectation leads the auditor to choose such a low level of detection risk that the manager's
best response is to report honestly always. In both cases, varying the auditor's expectation has no
effect on the manager's optimal response, so strategic dependence is zero. The proposition also
establishes that continuous changes in λ which move the game toward these extreme cases cause
the |T_n| to decrease continuously.


Auditor Liability When Unintentional Errors are Rare. 


Because there appear to be natural limits to the penalties for incorrect acceptance and
rejection, it is unclear what circumstances allow the extreme liability ratios of Proposition 3.2 to
be feasible. The following proposition sheds some light on this issue by showing that as the rate
of unintentional error goes to 0, any finite liability ratio falls in the lower region (λ < λ'), so that
increasing the ratio can increase (but never decrease) strategic dependence.


Proposition 3.3. Let λ' be as defined in Proposition 3.2. Then as R → 0, λ' → ∞.


Increasing the auditor's liability ratio lowers the auditor's highest rational level of detection
risk. This can reduce or eliminate strategic dependence if the highest rational level of detection
risk becomes low enough for the manager to prefer representing honestly given most or all rational
auditor strategies. However, as the risk of unintentional error approaches zero, the auditor's
highest rational level of detection risk approaches 1 regardless of the liability ratio (because the
auditor can rationally expect that no balances are misrepresented). In this case, increasing the
auditor's liability ratio serves only to make the auditor's best response function extremely steep,
and to allow a rational auditor to choose from a very wide range of strategies, resulting in an audit
of the type depicted in figure 3. This shows that a high liability ratio when unintentional errors
are very rare can create very severe strategic dependence.

This result provides an interesting juxtaposition to the traditional comparative statics result 
that an increase in the liability ratio causes a decrease in the Nash equilibrium rate of audit risk. 
Proposition 3.3 shows that when the rate of unintentional error is low, such an increase can also 
make it less likely that the Nash equilibrium will be attained, harming the ex post optimality of 
the audit, and allowing the level of audit risk actually attained to change in unpredictable ways. 
This result suggests a rationale for instituting higher liability ratios when internal control systems 
are weak or true balances are likely to be low, while not penalizing auditors so severely for failing 
to detect intentional misrepresentations. This is consistent with the audit profession’s position in 
the ongoing debate over the “expectations gap” regarding auditors’ liability in the presence of 
fraud (O'Reilly et al. 1990). 


Audit Technology 


The sample data collected by the auditor provide no direct evidence on the manager's strategy 
(they provide evidence only on the true account balance); however, more precise data can reduce 
strategic dependence, and thus make the auditor's risk assessment more accurate. Conversely, 
extremely imprecise data can cause strategic dependence to be so high that the auditor can 
rationally expect (and the manager can rationally choose) any strategy, making it difficult for the 
auditor to choose an optimal audit strategy. 


Proposition 3.4. As the auditor's data become perfectly precise, strategic dependence
vanishes (as σ → 0, T_n → τ* for n ≥ 1). As the auditor's data become perfectly
uninformative, strategic dependence can approach 1 (as σ → ∞, either T_n → τ* = 1,
T_n → τ* = 0, or T_n → T_0 for n ≥ 1).


The auditor's decision to accept or reject depends on both the auditor's expectation of the
manager's strategy and the realization of the audit data z. As the audit data become extremely
precise, the auditor's decision is determined almost entirely by the realization of that data and is
almost unaffected by the auditor's risk assessment. The slope of the auditor's best response
function therefore approaches zero (δ'(τ) approaches 0 for all τ), so that the manager faces a
straightforward one-person decision problem.

As the audit data become extremely imprecise, the auditor's decision is determined almost
entirely by the auditor's expectation, and is almost unaffected by the data. In extreme
parameterizations, the auditor may prefer to accept for every expectation or reject for every
expectation. However, more realistic parameterizations would leave the auditor preferring to
accept always for some expectations and reject always for others, so that δ(0) → 1 and
δ(1) → 0. Because this will cause D_0 to surround T' (as in figure 3), the auditor can expect (and
the manager can choose) any level of τ.

This result suggests that auditors may be less likely to choose the optimal level of detection 
risk when the audit balance is not easily verified, as in the audit of a subjective balance or 
management estimate. 


Manager Incentives 


The value of v (which determines the variation in V_1(t) over t) affects strategic dependence
in much the same way as the precision of the auditor's data. If V_1(t) varies widely over different
values of t, then the manager's decision is determined almost entirely by the realization of t, and
is affected little by the manager's expectation of the auditor's action. Strategic dependence will
be low because the manager's probability of misrepresentation is determined primarily by an
exogenous variable (t), and not by the manager's expectation of the auditor's strategy.

If V_1(t) varies little over different values of t, then the manager's decision is determined
almost entirely by the expected level of detection risk, and is affected little by the realization of
t. As long as the manager prefers to represent honestly for some expectations and represent
dishonestly for others, strategic dependence will be very high, because a slight change in the
auditor's expectation can cause a dramatic change in the manager's optimal response.
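
The sensitivity described in the two paragraphs above can be sketched numerically. The sketch assumes a linear incentive w − v·t for successful misrepresentation (as in the appendix) and a manager who misrepresents whenever the expected payoff from doing so exceeds the payoff from honest reporting; the names `payoff_detected` and `payoff_honest`, the cutoff rule, and all numerical values are hypothetical and chosen only for illustration.

```python
def manager_cutoff(delta: float, w: float, v: float,
                   payoff_detected: float, payoff_honest: float) -> float:
    """Cutoff tau in [0, 1]: manager types t < tau misrepresent.

    Assumed indifference condition (illustrative):
        delta * (w - v * tau) + (1 - delta) * payoff_detected = payoff_honest
    where delta is the detection risk (probability a misstatement is accepted).
    """
    if v <= 0.0:
        raise ValueError("v must be positive")
    if delta <= 0.0:
        # Misrepresentation is always caught; compare the two sure payoffs.
        return 1.0 if payoff_detected > payoff_honest else 0.0
    required = (payoff_honest - (1.0 - delta) * payoff_detected) / delta
    tau = (w - required) / v
    return max(0.0, min(1.0, tau))


def cutoff_swing(v: float, delta_a: float = 0.48, delta_b: float = 0.50) -> float:
    """Change in the manager's cutoff caused by a small change in detection risk."""
    params = dict(w=2.1, v=v, payoff_detected=0.0, payoff_honest=1.0)
    return abs(manager_cutoff(delta_b, **params) - manager_cutoff(delta_a, **params))


if __name__ == "__main__":
    print("wide variation   (v = 1.0):", round(cutoff_swing(1.0), 3))
    print("narrow variation (v = 0.1):", round(cutoff_swing(0.1), 3))
```

With these assumed numbers, the same small change in expected detection risk moves the cutoff far more when v is small, which is the mechanism behind high strategic dependence.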

The following proposition establishes that when variation in the manager's incentives is 
sufficiently small, accurate risk assessment may be impossible. (The case when variation is great 
is complicated by the restriction V,(1) > V,.) 


Proposition 3.5. As the variation in the manager's incentive to misrepresent becomes very
small, strategic dependence can approach 1 (as v → 0, either T_n → τ* = 1, T_n → τ* = 0,
or T_n → T_0 for n ≥ 1).


Despite their mathematical similarities, the interpretations of propositions 3.4 and 3.5 are 
quite different. Variations in manager incentives reflect an inverse measure of the auditor's 
information about the manager's incentives—low variation indicates that the auditor knows the 
manager's incentive with great accuracy. Proposition 3.5 therefore shows that an auditor with a 
more accurate estimate of the manager's incentive may have more difficulty predicting the 
manager's strategy. (Of course, this assumes that the manager's incentive is not extremely low 
or extremely high; in either of these cases, the manager always reports honestly or always 
misrepresents, so that strategic dependence is eliminated.) This counterintuitive result may shed 
some light on the effect of SAS 53's requirement that auditors collect additional data (an action 
which reduces strategic dependence) when the manager is known to have an incentive to 
misrepresent intentionally. 


Bounds on Detection Risk 


The auditor's payoff is usually assumed to depend only on whether the representation the
auditor accepted or rejected was true or false. However, regulations can also penalize auditors for
adopting certain strategies regardless of the outcome. In particular, SAS 47 effectively prohibits
extremely high levels of detection risk, even if such a level would be optimal given the auditor's
assessment of inherent risk:


As the auditor’s assessment of inherent and control risk decreases, the detection risk 
that he can accept increases. It is not appropriate, however, for an auditor to rely 
completely on his assessment of inherent risk and control risk to the exclusion of 
performing substantive tests of account balances.... (SAS 47, Sec. 25) 


In an analysis which assumes equilibrium behavior, such a standard would affect the
auditor's behavior only if it ruled out the equilibrium strategy. (Otherwise it would only rule out
strategies which the auditor would not choose anyway.) However, a maximum bound on
detection risk can reduce strategic dependence by requiring the auditor to choose the same
strategy (the highest allowable level of detection risk) for a wide range of expectations, effectively
truncating D_0. As a result, SAS 47 and similar standards can improve the auditor's ability to assess
inherent risk accurately.


Proposition 3.6. Assume that it is common knowledge that the auditor chooses a value of δ
strictly less than κ. Then for δ* < κ < Min{δ(0), τ⁻¹(1)}, a decrease in κ causes a decrease
in |T_n| for all finite n ≥ 1.


Mandating an upper bound on detection risk indirectly results in a lower bound on detection
risk, which in turn ensures that misrepresentation rates will not be below some lower bound. In
this way, standards improve the accuracy of the auditor's inherent risk assessment by eliminating
both extremely high and extremely low levels of inherent risk. This effect is illustrated in figure
5. Without any restrictions, the auditor could rationally choose a wide range of strategies.
However, the restriction of δ to a value less than or equal to κ reduces the upper bound of D_0, which
reduces the upper bound of T_1, which increases the lower bound of D_1, and so on, as shown by
the dashed line. Thus, for any finite number of iterations of rationalization, a lower maximum κ
allows a more accurate risk assessment. Decreasing κ causes the dashed rectangle to contract,
implying that the T_n contract as well.
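
The iteration described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: the linear best response functions `delta_br` and `tau_br`, their coefficients, and the way the sets are indexed are all hypothetical, chosen so that δ(τ) is decreasing, τ(δ) is increasing, and the Nash equilibrium is δ* = τ* = 0.5.

```python
def clip01(x: float) -> float:
    return max(0.0, min(1.0, x))


def delta_br(tau: float) -> float:
    """Hypothetical auditor best response: detection risk falls as tau rises."""
    return clip01(0.9 - 0.8 * tau)


def tau_br(delta: float) -> float:
    """Hypothetical manager best response: misrepresentation rises with delta."""
    return clip01(1.2 * delta - 0.1)


def tau_interval(kappa: float, rounds: int) -> tuple:
    """Bounds on the manager's rationalizable strategies after `rounds` >= 1
    rounds of rationalization, when detection risk may not exceed kappa."""
    # Initial auditor set: best responses to any tau in [0, 1], capped at kappa.
    d_lo, d_hi = delta_br(1.0), min(delta_br(0.0), kappa)
    t_lo, t_hi = 0.0, 1.0
    for _ in range(rounds):
        # tau_br is increasing, so it maps the bounds of D to the bounds of T.
        t_lo, t_hi = tau_br(d_lo), tau_br(d_hi)
        # delta_br is decreasing, so the bounds swap; the cap still applies.
        d_lo, d_hi = delta_br(t_hi), min(delta_br(t_lo), kappa)
    return t_lo, t_hi


def width(kappa: float, rounds: int) -> float:
    lo, hi = tau_interval(kappa, rounds)
    return hi - lo


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(width(0.8, n), 4), round(width(0.7, n), 4))
```

In this example both caps lie between δ* and Min{δ(0), τ⁻¹(1)}, and for each finite number of rounds the lower cap yields the narrower interval, in line with the contraction of the dashed rectangle.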

It is interesting to note that rationalization can rule out the original maximum level of
detection risk κ if the slopes of the best response functions are small enough to make B^4(κ) < κ.
This means that the auditor could reduce strategic dependence by establishing a credible and
self-fulfilling reputation for not choosing levels of detection risk above κ. However, such a reputation
seems unlikely to eliminate the need for mandated standards on detection risk, for two reasons.
First, for games in which strategic dependence is otherwise severe (so that an upper bound on
detection risk is most valuable), a voluntary statement will typically not be credible. Second, the
manager will expect a level of detection risk less than κ only if the manager believes that the
auditor believes that the manager believes that the auditor believes that the manager believes that
the auditor will not choose a high level of detection risk. Such "higher order" beliefs may be
difficult to obtain unless the reputation is backed by enforcement and penalty. Auditing standards
and audit firm policies can provide these sources of credibility.


V. CONCLUSION 


An accurate assessment of inherent risk is crucial to an efficient and effective audit. Previous
models of the audit setting simply assume that the auditor is able to predict inherent risk
accurately. While such an assumption seems reasonable in a decision theoretic setting in which
inherent risk is exogenous, it is more problematic in a strategic setting in which inherent risk is
chosen privately by a manager. This paper uses an explicit model of how the players form
expectations to identify audit characteristics which can influence the ability of the auditor to
predict a manager's strategic action. This ability is shown to be negatively related to the degree
of "strategic dependence" (the product of the average slopes of the players' best response
functions), which reflects the sensitivity of the manager's optimal response to variations in the
auditor's expectation.


FIGURE 5
The effect of an upper bound on detection risk.
If the auditor is prohibited from choosing a level of detection risk greater than κ, the manager
will not choose a misrepresentation rate greater than τ(κ), so that the auditor will not choose a level
of detection risk below δ(τ(κ)), as shown by the lower boundary of the dashed rectangle. Lowering
κ causes the rectangle to contract, restricting the range of strategies the players can choose.

The analysis shows that high rates of unintentional errors, wide variations in the manager's
possible payoffs, precise audit technology and strict standards on the auditor's maximum
allowable choice of detection risk tend to decrease strategic dependence, and therefore improve
the accuracy of the auditor's risk assessment. The effects of the auditor's liability ratio (the
penalty for incorrect acceptance divided by the penalty for incorrect rejection) depend on the risk
of unintentional errors. When the risk of unintentional errors is high, increasing the liability ratio
typically decreases strategic dependence; when the risk of unintentional errors is low, increasing
the liability ratio typically increases strategic dependence.

These results contribute to the analytical auditing literature in two ways. First, they provide 
insight into ways in which audit firms and regulators might improve the efficiency and 
effectiveness of audits, through policies on data collection, auditor liability, and allowable 
choices of detection risk. Second, they suggest that the traditional Nash equilibrium analysis of 
strategic audit settings may have limited predictive power. Because Nash equilibrium outcomes 
may arise only when strategic dependence is low, the behavior of auditors and managers may not 
always change in the ways predicted by traditional (comparative statics) Nash equilibrium 
analysis. The results also provide a natural opportunity for experimental tests of whether the audit 
characteristics explored in the paper actually affect the accuracy of the auditor’s inherent risk 
assessment, and the predictive power of the Nash equilibrium outcome. 


APPENDIX 


Proof of Proposition 1.1 
To compute δ(τ), let

    P(τ) = Pr(θ_L | balance represented as high)
         = [R + τ(1 − R)](1 − p) / {[R + τ(1 − R)](1 − p) + p}          (1)

be the inherent risk that a balance represented as high is truly low, and let

    ε(δ) = Pr(z < c | θ_H, balance represented as high)          (2)

be the probability of incorrect rejection associated with detection risk δ. The following Lemma
provides some facts about ε(δ), which will be used later in the analysis. The proof is available upon
request.
2c-n) 
Lemmal: —€'(5)- » a | < 0 and e"(8) > 0. 


The auditor's utility function is written 


U(1,8) = PÆL (a 16,,0,)+(-8)(a16,, e,)! 


HI- Pee®U(r16,,0,,)+1-e8)(a 16,0) (3) 
which yields the first order condition 
; PP 
Ò DE pR AEO OAT e, 0. 
e9) [1 - P(t) ]|a m 


The second derivative of (3) with respect to δ is simply [1 − P(τ)] ε''(δ) α < 0, so the value of δ
satisfying (4) is indeed an optimum. To compute the slope of δ(τ), apply the envelope theorem:

    δ'(τ) = − [∂²U(τ,δ)/∂δ∂τ] / [∂²U(τ,δ)/∂δ²]
          = − P'(τ)[β − ε'(δ)α] / {[1 − P(τ)] ε''(δ) α} < 0.          (5)


Proof of Proposition 1.2
First, compute the manager's utility as 
t 
V(4,5) = [V,(06--(1— 8)Vadt + (1— V; (6) 
t=0 ` 
The first derivative of (6) becomes 
Va (S + (1—-8)Vp — Vi. (7) 
Substituting in V, (t) = w - v t yields a first order condition 
1(5) = 1 |v a Conan] (8) 
y 


The second derivative of (6) is −vδ < 0, so the value of τ satisfying (8) is an optimum. The slope
of τ(δ) is given by the envelope theorem:


IZV (T, 5) y 
1'(8) =— got = ae »0. (9) 


Q^ V(I, Sic vd 


Proof of Proposition 3.1 


Because δ(τ) → 0 as R → 1, there exists a value R' such that if R > R', then δ(1) < τ⁻¹(0).
If δ(0) < τ⁻¹(0), then T_n = τ* = 0 for R > R', and R' satisfies the requirements of the proposition.
Otherwise, T_1 = [0, B(δ(0))]. In this case, an increase in R causes a decrease in δ(0) and therefore
a decrease in B(δ(0)), so that |T_1| decreases. For higher n, note that D_1 = [B^2(δ(1)), B^2(δ(0))], so
that T_2 ⊇ [0, B(δ(0))] = T_1. A simple induction argument establishes that T_n = T_1, so that R' satisfies
the requirements of the proposition.


Proof of Proposition 3.2
The proof is analogous to the proof of Proposition 3.1, and is omitted. 


Proof of Proposition 3.3 


As R → 0, δ(0) → 1 for all finite λ; therefore, δ(0) > τ⁻¹(1), so that either T_n = τ* = 1, or T_1 =
[τ(δ(1)), 1]. In the latter case, an increase in λ causes a decrease in τ(δ(1)), and therefore increases
|T_1|. A simple induction argument establishes that an increase in |T_1| causes increases in |T_n|, for
n ≥ 1. Thus, all λ satisfy part (a) of Proposition 3.2.


Proof of Proposition 3.4
Rewrite the auditor's first order condition


east -QATA z0 (10) 


where Q(τ) = P(τ)/[1 − P(τ)]. Because the second term of the equation is finite, c must converge
to 1/2 as σ goes to 0, so that δ(1) → 0 and δ(0) drops below τ⁻¹(0). As σ goes to infinity, the sign
of the derivative approaches the sign of 1 − Q(τ)λ for all values of δ. If Q(0) > 1/λ or Q(1) < 1/λ,
then the sign is always positive or always negative, so that the auditor chooses a boundary
strategy δ(τ) = 0 or δ(τ) = 1, and T_n = τ*. Otherwise, the continuity of Q(τ) guarantees that there
is a value τ* for which δ(τ) = 1 for τ < τ* and δ(τ) = 0 for τ > τ*. As a result, T_1 = T_0 = T. Induction
shows that T_n = T_0 for n ≥ 1.


Proof of Proposition 3.5 
The proof is analogous to that of Proposition 3.4, and is omitted. 


Proof of Proposition 3.6 


Assume that κ < Min{δ(0), τ⁻¹(1)}. Induction establishes that for m = 4k+1, B^m(κ) represents an
upper bound of some set T_n, while for m = 4k+3, B^m(κ) represents the lower bound. To find how
those boundaries change as κ changes, simply use the chain rule:

    dB^m(κ)/dκ = ∏_{j=1}^{m} ∂B^j(κ)/∂B^{j−1}(κ)          (11)

The partial derivatives alternate between τ'(δ) > 0 and δ'(τ) < 0. The product of any 2 consecutive
derivatives is negative, while the product of any 4 consecutive derivatives is positive. The last
derivative is always τ'(κ), which is positive. Therefore, when B^m(κ) is an upper bound (and m =
4k+1), the derivative is positive, and a decrease in κ causes T_n to shrink. When B^m(κ) is a lower
bound (and m = 4k+3), the derivative is negative, and again a decrease in κ causes T_n to shrink.


I. INTRODUCTION


The notion that firms change accounting methods to manipulate their reported earnings has
been examined extensively in the accounting literature.¹ While the practice of using
write-downs has spread rapidly in recent years,² the existing literature has treated
accounting changes and asset write-downs as two independent issues. In this paper, we analyze
these two issues as a joint decision in financial reporting.

Specifically, we investigate the clustering of accounting changes and write-downs in the oil 
and gas industry during the 1985-1986 period. In these two years, the drastic decline in oil prices 
coincided with a rush of accounting changes from the full cost (FC) method to the successful 
efforts (SE) method. Those companies that retained the FC method reported large asset write- 
downs. Both the accounting changes and the asset write-downs triggered controversies in the 
accounting profession.³

Many studies indicate that earnings-based bonus plans are a popular means of rewarding 
corporate executives,⁴ and that executives have an incentive to manipulate accounting numbers
in order to maximize their bonus payout. This paper compares the firms that switched their oil and 
gas accounting method from FC to SE (hereafter called “switch firms”) and the FC firms that took 
the write-down decision (hereafter, “write-down firms"). In a time-series analysis of bonus data, 
we find that the bonuses for the executives of switch firms are associated with the firms’ 
accounting income, suggestive of the effects of bonus plans on the switch decision. 

On the other hand, we also show that the degree of the income/bonus association for the write- 
down firms is at least as high as that for the switch firms. We report several findings to explain 
the former group’s decision to take a write-down. First, relative to the switch firms, the write- 
down sample had more firms reporting a loss (before write-down) during the accounting decision 
year. This finding is consistent with the “lower-bound hypothesis” posited by Healy (1985), i.e., 
when accounting income is below the lower bound, managers might reduce the income further 
by taking write-downs since no bonus will be paid. Second, managers of write-down firms which 
were profitable (before the write-down) were shielded from the effect of the write-down in their 
bonuses in the decision year. This finding is consistent with Dechow et al. (1994), who posit that 
compensation committees protect managers from the bonus effects of non-performance-related 
measures such as write-downs. Finally, the write-down firms were more intensive in oil and gas 
exploration and more concentrated in oil and gas production than the switch firms. This indicates 
that the firms switching to SE possess the same characteristics as the firms adopting the SE method,
as reported by Malmquist (1990). These characteristics make the switch firms' income less
susceptible to the effects of unsuccessful exploration costs. As a result, managers could find it 
more convenient to change to the SE method than to request the intervention of the compensation 
committee to adjust the bonus formula. Conversely, the write-down firms, fearful of the larger 
effects of unsuccessful exploration costs on future income, could be unwilling to switch to the SE 
method and hence depend on the intervention by the compensation committees. 


1 See Watts and Zimmerman (1986, 1990). 

2 As an example, see Elliott and Shaw (1988). This issue has also caught the attention of the Financial Accounting 
Standards Board (FASB), which is considering a new guideline for the disclosure of asset impairment (1993). 

3 For example, in April and May 1986, the Oil and Gas Journal (1986a, 1986b) and The Wall Street Journal (1986a,
1986b, 1986c) reported oil and gas firms' failed efforts to ask the SEC to delay the write-downs. In October of the same
year, the Oil and Gas Journal (1986c) and The Wall Street Journal (1986d, 1986e) reported the SEC's failure to
eliminate the FC method.

4 For example, see Antle and Smith (1986), Murphy (1985), and Lambert and Larcker (1987).

The remainder of the paper is organized as follows. Section II describes the collapse of oil
prices in 1986 and the related accounting changes and asset write-downs it triggered. Section III
discusses the sampling procedures and compares the characteristics of switch and write-down
firms. Section IV suggests three hypotheses to explain the accounting decisions. Section V reports
the empirical results, and the last section summarizes the findings and discusses their implications.


II. OIL PRICES, CEILING TEST WRITE-DOWNS, AND 
ACCOUNTING CHANGES 


Oil Prices and the “Cost Center Ceiling Test" 


Oil prices suffered a precipitous decline in the first quarter of 1986. The spot price of the 
benchmark West Texas Intermediate crude oil tumbled from $26.30 per barrel on December 31, 
1985 to $10.40 on March 31, 1986. This drop in oil prices brought into effect the ceiling test write- 
down requirement, as stated in the Securities and Exchange Commission's (SEC) Regulation 
S-X. 

Regulation S-X 4-10(i)(4) requires the oil and gas companies that adopt the FC method to
perform a "cost center ceiling test" in every reporting period. This requirement does not apply
to the SE firms, although they are encouraged to observe it. The ceiling for each cost center (i.e.,
for each country, according to ASR 258 (SEC 1978)) is determined largely by the present value
of the future net revenue from oil and gas reserves, based on the oil prices at the end of each
quarter.⁵ In each quarter, a write-down is necessary if the ceiling falls below the cost center's
capitalized exploration expenditures. According to APB No. 30 (AICPA 1973), this write-down
should be treated as an ordinary loss, and is therefore included in the income from continuing
operations. Moreover, just like an inventory write-down under the lower of cost or market (LCM)
rule, once the capitalized costs are written down, no recapture is allowed even if oil prices rise in
the future.
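
The mechanics of the ceiling test can be sketched in a few lines. This is a simplified illustration that follows the four components of the SEC's ceiling definition as summarized in this paper; the class, field names, and dollar amounts are hypothetical, and the sketch ignores the details of how each component is measured in practice.

```python
from dataclasses import dataclass


@dataclass
class CostCenter:
    """Hypothetical inputs for one cost center (e.g., one country)."""
    pv_future_net_revenue: float            # proved reserves, at quarter-end prices
    cost_of_properties_not_amortized: float
    unproved_properties_cost: float
    unproved_properties_fair_value: float
    income_tax_effects: float               # book/tax basis differences
    capitalized_costs: float                # capitalized exploration expenditures


def ceiling(cc: CostCenter) -> float:
    """Ceiling = PV of future net revenue + cost of properties not amortized
    + lower of cost or fair value of unproved properties - income tax effects."""
    unproved = min(cc.unproved_properties_cost, cc.unproved_properties_fair_value)
    return (cc.pv_future_net_revenue
            + cc.cost_of_properties_not_amortized
            + unproved
            - cc.income_tax_effects)


def ceiling_test_write_down(cc: CostCenter) -> float:
    """Write-down required when capitalized costs exceed the ceiling, else zero."""
    return max(0.0, cc.capitalized_costs - ceiling(cc))


if __name__ == "__main__":
    # Illustrative numbers only: a price drop lowers the present value of reserves.
    before = CostCenter(110.0, 10.0, 15.0, 12.0, 7.0, 120.0)
    after = CostCenter(80.0, 10.0, 15.0, 12.0, 7.0, 120.0)
    print(ceiling_test_write_down(before), ceiling_test_write_down(after))
```

Because the write-down cannot be recaptured, the function would be applied each quarter to the already written-down capitalized costs rather than to the original amount.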


Clustering of Write-downs and Accounting Changes in 1985 and 1986


Figure 1 puts the impact of oil price behavior on the ceiling test write-downs and accounting 
changes in a historical perspective. Panel A shows the year-end crude oil prices from 1980 to 
1986. After peaking in 1981, the prices declined gradually, and collapsed in 1986. Panel B of 
figure 1 shows the percentage of the 106 U.S. firms using the FC method that took a ceiling test 
write-down in each year (right scale). In 1986, about 60 percent of the FC firms had to take the 
ceiling test write-down, an increase from 35 percent in 1985. The percentage of the firms taking 
write-downs during 1980 through 1984 was much smaller; this difference can be attributed to the 
higher oil prices in those earlier years. 

The behavior of oil prices also affected the nature of the accounting changes. Panel B of figure
1 (the vertical bars, measured against the scale on the left-hand side) shows that before the drastic
oil price declines in 1985 and 1986, most accounting changes were from the SE method to the FC
method. However, during the episode of the oil price collapse in 1986, most accounting changes
were from FC to SE.


5 More precisely, the ceiling is defined by the SEC as the sum of: 1) the present value of the future net revenue from the
estimated production of proved oil and gas reserves, based on the oil prices at the end of each quarter; 2) the cost of
properties not being amortized; and 3) the lower of either the cost or the estimated fair value of unproved properties less
4) the income tax effects related to the differences between the book and the tax basis of the properties involved.


FIGURE 1
The Time-series of the Refiner Acquisition Cost of Crude Oil, the Percentage of Firms
Taking Write-downs, and the Number of Firms Changing Accounting Methods:
1980 to 1986

A. Refiner Acquisition Cost of Crude Oil (dollars per barrel), 1980 to 1986

B. The Percentage of Firms Taking Write-downs (right scale) and the Number of Firms Changing Oil and
Gas Accounting Methods (left scale)

Data sources: The refiner acquisition cost of crude oil (composite) is taken from U.S. Energy Information Administration
reports. The percentage of FC firms taking write-downs is the percentage of the 106 U.S. firms in the Arthur Andersen
Survey (1987) which took a write-down in each year, as identified from financial statements. The number of firms
changing to FC and number of firms changing to SE are taken from Arthur Andersen Survey (p. S-17, 1987).


Thus, when oil prices declined drastically in the 1985-86 period, firms using the FC method
faced a decision: take a write-down or switch to the SE method in order to circumvent the
regulation. These two distinct choices offer a rare opportunity to study the trade-off between asset
write-down and change of accounting principle.


III. ANALYSIS OF THE TRADE-OFF BETWEEN WRITE-DOWN 
AND FC-TO-SE SWITCH 


Sample Selection 


To examine the trade-off between asset write-down and accounting change, we identified a 
sample of switch firms and a sample of write-down firms during the 1985-86 period. Oil prices
dropped from above $30 in September 1985 to about $20 in December of that year, and continued 
to decline to as low as $10 in March of 1986. So before firms issued their 1985 financial reports, 
they already knew whether a write-down was necessary. According to the SEC (see Chen 1991), 
write-downs could be taken in either 1985 or 1986. If managers wanted to circumvent the write- 
down, they could switch to the SE method in either the 1985 or 1986 fiscal year. 

The sample selection procedures are summarized in table 1. A total of 21 switch firms were
identified: 19 from the Arthur Andersen Survey (1987)⁶ and two from the NAARS Data Base.⁷
However, the requirement that firms have at least ten years of bonus data (from 1977 to 1991) and
at least five years of positive income from continuing operations reduced the sample size to 12
for the time-series bonus regressions to be discussed below.⁸

As to the write-down sample, we started with 106 U.S. corporations using the FC method
identified from the Arthur Andersen Survey. After eliminating 39 which did not take a
write-down, six that went bankrupt in 1986, and one that changed its fiscal year in the same year, we
obtained 60 write-down firms. Since 34 of the 60 firms did not have at least ten years of bonus
data from 1977 to 1991, and another four firms did not have at least five years of positive income,
we are left with 22 write-down firms for the time-series bonus regressions.⁹


Characteristics of the Switch and Write-down Firms 


Malmquist (1990) shows that firms choosing the SE method tend to be less leveraged 
financially, larger, and devote fewer of their resources to drilling and exploration. In table 2, 
we compare these characteristics for the two groups of sample firms: leverage (LEVRG, debt to 
total assets), firm size (SIZE, the natural logarithm of sales), exploration intensity (EXPI, the 
exploration expenditure divided by oil and gas sales), and degree of concentration (CNCENTR, 
concentration in oil and gas production, defined as oil and gas sales divided by total sales). 


6 By most measures, the Arthur Andersen Survey covers more than three-fourths of the public companies in the U.S. oil 
and gas industry (Arthur Andersen, 1987, S-1). 

7 In NAARS, we searched for key words such as "switch" or "change," together with "oil and gas accounting method." 

8 In the time-series bonus regression presented later, the logarithm of the income is used as an independent variable. 
Following the procedure of Healy et al. (1987), the logarithm of negative income is defined as zero. If a firm has losses 
in many years, it would have many zero observations in its independent variable. Thus, the firms used in the time-series 
test are limited to those with at least five years of positive income. 

9 There are two reasons why so many firms lack sufficient bonus data. First, the oil and gas industry underwent an 
extensive restructuring in the 1980s. Many firms have either gone bankrupt or merged (Chen and Lee 1993). Second, 
some of the proxy statements are missing or unavailable from our data sources, which include the Q-File and the SEC's 
Public Reference Room in Washington, D.C. 

10 Malmquist (1990) examines the firms' choices of oil and gas accounting methods, rather than switches between 
methods. Another study, by Johnson and Ramanan (1988), investigates the accounting change from the SE to the FC 
method. 
TABLE 1 
Summary of Sample Selection Process 


1. Selection of FC-to-SE switch sample 

FC-to-SE switch firms from the Arthur Andersen Survey                      19 
FC-to-SE switch firms from the NAARS Data Base                              2 
Total switch firms                                                         21 
Less: Firms without at least 10 years of bonus data                 8 
      Firms without at least five years of positive income          1      (9) 
Switch sample for time-series bonus regressions                            12 


2. Selection of write-down sample 

Non-switch U.S. FC firms in the Arthur Andersen Survey                    106 
FC firms that took no write-down in 1985 or 1986                          (39) 
Non-switch U.S. FC firms that took a write-down in 1985 or 1986            67 
Less: Firms in bankruptcy in 1986                                   6 
      Firms that changed fiscal year in 1986                        1      (7) 
Total write-down firms                                                     60 
Less: Firms without at least 10 years of bonus data*               34 
      Firms without at least five years of positive income          4     (38) 
Write-down firms for time-series bonus regressions                         22 

* Between 1977 and 1991. 


According to Malmquist, financial leverage is a factor in the FC/SE choice because using the 
SE method could result in a lower equity and a higher debt-equity ratio, creating difficulties in 
securities underwriting and debt covenants compliance. As will be discussed later, the FC-to-SE 
switch reduced the common equity in many cases. Financial leverage may not be an important 
factor in the switch/write-down decision. This is confirmed by table 2, which shows that LEVRG 
is not significantly different between the two groups. 

On the other hand, table 2 shows that the switch firms are on average larger in SIZE and more 
diversified (with a lower measure of CNCENTR) than are the write-down firms. The former 
group's average measure of exploration intensiveness (EXPI) is lower than that of the latter, 
TABLE 2 
Attributes of Sample Firms 

               Mean of      Mean of       t-statistic 
               switch       write-down    for mean       Wilcoxon's 
Attributesᵃ    firmsᵇ       firmsᵇ        difference     Z-statistic 

LEVRG           .282         .295          0.303          0.178 
               (.165)       (.242) 
EXPI            .314         .434         -1.646**       -1.353 
               (.240)       (.389) 
CNCENTR         .394         .664         -2.538***      -2.351*** 
               (.422)       (.419) 
SIZE           5.180        3.411          2.780***       2.441*** 
              (2.724)      (2.432) 
CHEXP         -0.437       -0.355         -0.677          0.267 
              (0.331)      (0.711) 


ᵃ Variable definitions (measured at the year prior to write-down or switch): 

LEVRG = long-term debt divided by the book value of total assets. 

SIZE = natural logarithm of total sales. 

EXPI = exploration intensity; total costs incurred in acquisition, exploration, and development divided by the revenue 
from oil-and-gas-producing activities. 

CNCENTR = business concentration; revenue from oil-and-gas-producing activities divided by total sales. 

CHEXP = percentage change in acquisition, exploration, and development costs of the switch/write-down year over the 
previous year. 

ᵇ Sample sizes: 21 switch firms and 60 write-down firms; numbers in parentheses are standard deviations. 

*** Statistical significance at the 0.01 level (one-tailed test). 

** Statistical significance at the 0.05 level (one-tailed test). 


although the difference is not significant according to the nonparametric Wilcoxon's test. 
According to Malmquist, larger firms have larger portfolios of drilling projects to diversify away 
the drilling risk that is immediately reflected in the reported earnings under the SE method. In 
addition, if a firm is less concentrated in oil and gas activities, its income's exposure to the drilling 
risk is reduced. The same can be argued for a firm spending less of its oil and gas revenue in 
exploration activities. Thus, the switch firms' operating characteristics are consistent with 
Malmquist's profile of firms which are more likely to choose the SE method. 

In table 2, we also compare the change in exploration expenditure (CHEXP, defined as the 
ratio of change in "acquisition, exploration, and development costs" of the switch year over the 
previous year) of the two groups. It is possible that the firms reducing exploration activities might 
opt for changing to the SE method, since as they spend less the effects of unsuccessful costs on 
income would be less. According to table 2, on average, both groups reduced exploration 
activities after the accounting decision year, but the change in the exploration expenditure is not 
significantly different between the two groups.¹¹ Thus, reduction in exploration expenditure itself 
does not explain the switch/write-down decisions. 


Reporting Consequences of the FC-to-SE Switch 


APB No. 20 (AICPA, 1971) defines the FC-to-SE switch as a retroactive type of accounting 
change. That is, all the comparative statements reported in the switch year (either 1985 or 1986; 
see column (1) of table 3) should be retroactively adjusted to the SE basis. Table 3 shows the 
effects of the switch on that year’s income from continuing operations (columns (2) and (3)), 
common equity (column (5)), and on the income in the year following the switch (column (7)).¹² 

For the switch year (time t), the difference between the SE income ($INCOME^{SE}$) and the FC 
income ($INCOME^{FC}$), ignoring taxes and omitting the time subscript, can be expressed as¹³ 


$$INCOME^{SE} - INCOME^{FC} = (DEP^{FC} - DEP^{SE}) - DHC,$$ 


where $DEP^{FC}$ and $DEP^{SE}$ are the depletion expenses under the FC and the SE methods, respectively; 
DHC is the dry hole cost that is expensed under the SE method but capitalized under the FC 
method. The differences between $INCOME^{SE}$ and $INCOME^{FC}$, including tax effects at the 46 
percent rate or based on after-tax figures reported in financial statements, are shown in column 
(2) of table 3 for each firm. Of the 17 firms with available data, ten have a negative difference (i.e., 
their income numbers are lower under the SE method than under the FC method). For these firms, 
$(DEP^{FC} - DEP^{SE})$ is less than DHC in the switch year. 

A more important effect of the FC-to-SE switch on the switch year's income is the avoidance 
of a write-down (which amounts to $CC^{FC} - CEILING$, the difference between the carrying cost 
of oil and gas properties under FC and the cost center ceiling) which would have been necessary 
under the FC method. Thus, after considering the write-down (WD), the difference between 
$INCOME^{SE}$ and $INCOME^{FC}$ can be written as 


$$INCOME^{SE} - INCOME^{FC} = (DEP^{FC} - DEP^{SE}) - DHC + WD. \tag{1}$$ 


Expression (1) could be positive (i.e., the income could be larger under SE than under FC) if WD 
is sufficiently large. 
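
As a minimal illustration of expression (1), the following Python sketch computes the SE-minus-FC income difference from its components. The function name and the numerical inputs are hypothetical and chosen only for illustration; they are not taken from table 3, and taxes are ignored as in the expression itself.

```python
def se_minus_fc_income(dep_fc, dep_se, dhc, wd=0.0):
    """Income under SE minus income under FC, following expression (1).

    dep_fc, dep_se: depletion expense under the FC and SE methods
    dhc:            dry hole cost (expensed under SE, capitalized under FC)
    wd:             ceiling-test write-down avoided by switching (0 if none)
    """
    return (dep_fc - dep_se) - dhc + wd


# Hypothetical figures (in $ millions), for illustration only.
without_wd = se_minus_fc_income(dep_fc=10.0, dep_se=6.0, dhc=7.0)
with_wd = se_minus_fc_income(dep_fc=10.0, dep_se=6.0, dhc=7.0, wd=5.0)
print(without_wd, with_wd)
```

In this hypothetical case the difference is negative when no write-down is at stake, because the dry hole cost exceeds the depletion saving, and it turns positive once the avoided write-down is large enough.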

The difference between the two income numbers defined in (1) for each firm is shown in 
column (3) of table 3. Among the 14 firms with data available, every firm reported a higher income 
number than what would have been reported under the FC method. In addition, the income 
reported under the newly adopted SE method presented in column (4) indicates that five firms 
(denoted by *) were able to turn a loss into a profit by switching to the SE method. 

The switch also affects the year-end book value of common equity beyond the net income 
change already described. The difference between the equity under the SE method and the equity 
under the FC method is $CC^{SE} - CEILING$. This difference could be either positive or negative.¹⁴ 


11 In addition, we also compared the exploration expenditure in the two years before and after the accounting change. The 
result is about the same: the switch firms’ reduction in exploration expenditure was not higher than that of the write- 
down firms. 

12 Since not every firm reported the detailed consequences of the switch on net income and common equity, many of the 
"as-if" numbers based on the FC method have to be estimated (hence the numbers reported in column (2)). The detailed 
estimation methods are available from the authors. 

13 The accounting effects reported in table 3 are on an after-tax basis (assuming a marginal tax rate of 46 percent). 

14 This is due to the fact that, under the FC method, the carrying cost should be written down to the ceiling if the latter is 
lower. A switch to the SE method will allow the firm to report the carrying cost at $CC^{SE}$. 
If $CC^{SE}$ is higher than CEILING, then the switch to SE will increase the common equity, and vice 
versa. Of the 14 firms with available information, the switch increased the common equity for 
seven of the firms (see column (5)). However, with the exception of American Exploration and 
Tenneco, the increase is small (less than 10 percent) when compared with the reported common 
equity. 

After the switch year, the signs of the three components on the right-hand side of equation 
(1) are as follows: WD will be zero, since oil prices did not continue to decline after 1985-86; both 
$(DEP^{FC} - DEP^{SE})$ and DHC are positive. Therefore, the sign of $(INCOME^{SE} - INCOME^{FC})$ after the 
switch year can be either positive or negative. If the exploration activities were significantly 
reduced (i.e., the DHC is low), the SE income could be higher than the FC income (i.e., (1) could 
be positive). The effects on the income during the year after the switch are estimated and reported 
in column (7) of table 3. It is seen that ten of the 12 firms with sufficient data reported a higher 
SE income in the year after the switch. 

The reporting consequences of the FC-to-SE switch delineated above can be summarized as 
follows: 


1) The FC-to-SE switch in 1985—86 was likely to improve the current accounting income, since 
the ceiling test write-down could be circumvented (see column (3) of table 3). 

2) The effects of the FC-to-SE switch on the 1985—86 values of total assets and common equity 
could be either positive or negative (see column (5) of table 3), since SE capitalizes fewer costs 
but required no ceiling-test write-down. 

3) The effects of the FC-to-SE switch on the income numbers after the switch year could be either 
positive or negative (see column (7) of table 3), depending upon whether exploration increased 
or decreased. 


Reporting Consequences of the Write-down Decision 


The write-down firms in our sample had to take a ceiling test write-down in 1985-86.¹⁵ The 
timing and magnitude of the write-downs taken by the 60 FC firms are reported in panel A of table 
4. Since many firms took a ceiling test write-down in both 1985 and 1986, we report the amounts 
of the write-downs in these two years as WD85 and WD86 respectively. The total write-down 
amount over the 1985—86 period is reported as WDT in the third column. The materiality of the 
write-down can be measured by the ratios of WDT/TA and WDT/EQUITY, where TA and EQUITY 
are the firm's total assets and common equity, respectively. For the 11 switch firms that have 
available data, we also report the write-downs avoided as a result of the accounting change in 
panel B of table 4. 

For the write-down firms, the median amount of the total write-down (WDT) is $17.8 million. 
Based on the median statistics, the total write-down reduces total assets by approximately 12 
percent and reduces common equity by 29 percent. Thus, the write-down decision drastically 
altered the firms’ financial statements. 

For the switch firms, the median amount of the write-down avoided is $8.2 million, which 
is two percent of the total assets and five percent of the common equity.¹⁶ Panel C shows that the 
relative write-downs taken by write-down firms are significantly higher than those avoided by 


15 The non-switch, non-write-down firms are excluded from our sample because they were not facing the trade-off decision 
studied in this paper. 

16 The write-down avoided for the switch firm is the difference between columns (2) and (3) in table 3. Only 11 firms 
reported the numbers for both columns to make possible the computation of this number. This might create a sampling 
bias. Thus, the interpretation of the comparison in table 4 should be subject to this limitation. 
switch firms. Thus, it appears that the magnitude of an avoidable write-down is not by itself the 
reason for the FC-to-SE switch. 


IV. DEVELOPMENT OF TESTABLE HYPOTHESES 


Judging from the reporting consequences discussed above, the FC-to-SE switch increased 
the accounting income of the switch year. Since the literature has shown that executive bonuses 
are usually a function of accounting income (e.g., Lambert and Larcker 1987), management’s 
accounting decision could be influenced by this bonus/income relationship. The most obvious 
hypothesis linking bonus plans to the accounting decision is that the write-down firms’ bonus 
plans are less sensitive to accounting income than those of the switch firms. This hypothesis 
assumes a mechanically fixed bonus formula, with differences in these formulae producing 
different accounting decisions.¹⁷ 

Following the model used by Healy et al. (1987), we specify the bonus regression on data 
from 1977 to 1991 as follows: 


$$\ln(CSB_t) = \sum_{i=1}^{n} \alpha_i D_{it} + \beta \ln(INCOME_t) + \epsilon_t \tag{2}$$ 


where $CSB_t$ is the cash salary and bonus paid to the CEO in year t;¹⁸ $INCOME_t$ is the reported 
income from continuing operations; $D_{it}$ equals 1 if individual i is the CEO of the firm in year 
t and 0 otherwise; and n is the number of individuals who held the CEO position during the 
sample period. Following Healy et al., ln(INCOME), the natural logarithm of INCOME, is defined 
as zero when INCOME is negative. This specification implies that managers receive a bonus only 
when their company reports a profit. 
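
The regressors in equation (2) can be sketched in Python as follows. This is only an illustration of how one firm-year observation might be coded under the convention described above; the function names are hypothetical, the treatment of an income of exactly zero (set to zero here) is an assumption, and the estimation step itself is not shown.

```python
import math


def log_income(income):
    """ln(INCOME), defined as zero when INCOME is not positive.

    Treating an income of exactly zero like a loss is an assumption here.
    """
    return math.log(income) if income > 0 else 0.0


def design_row(ceo_index, n_ceos, income):
    """Regressors for one firm-year: n CEO dummies followed by ln(INCOME).

    ceo_index: position (0-based) of the individual who is CEO in that year
    n_ceos:    number of individuals who held the CEO position in the sample
    """
    if not 0 <= ceo_index < n_ceos:
        raise ValueError("ceo_index must lie in [0, n_ceos)")
    dummies = [1.0 if i == ceo_index else 0.0 for i in range(n_ceos)]
    return dummies + [log_income(income)]


# A loss year under the second of three CEOs: the income regressor is zero.
print(design_row(ceo_index=1, n_ceos=3, income=-4.0))
```

Because the CEO dummies sum to one in every year, they take the place of a common intercept, so each CEO has his or her own level of cash compensation while the bonus beta is shared across CEOs of the same firm.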

The β-coefficient in equation (2) is referred to as the "bonus beta." When the bonus beta is 
mechanically fixed and differs across firms, we have the following straightforward explanation 
for the write-down/switch decision (in alternative form): 


$H1_{alt}$ (bonus beta hypothesis): The bonus beta of the switch firm is higher than that of the 
write-down firm. 


Although rejecting $H1_{null}$ would provide a clear association between bonus practices and the 
write-down/switch decision, failure to do so does not negate the possibility of bonus consideration 
in the accounting decision. Rather, it would suggest two other possibilities: (1) bonus plans have 
lower bounds and management intends to take a write-down in a bad year where bonuses are zero 
(Healy 1985), or (2) compensation committees intervene to protect executive bonuses from the 
impact of uncontrollable factors such as ceiling test write-downs (Dechow et al. 1994). 

A direct test of the first possibility would be to observe the bonus lower bounds specified in 
bonus plans. However, we find, consistent with Healy (1985) and Defeo et al. (1990), that many 
firms do not provide the details of bonus formulae in their public filings. According to our 
examination of bonus plans, most plans merely indicate that the bonuses are determined by the 
board of directors or compensation committee. In addition, telephone conversations with firms' 
executives reveal that many firms maintain an implicit threshold or lower bound. The threshold 
level is usually not fixed, and may change from year to year. Thus, the bonus lower bound 
hypothesis has to be stated indirectly: 


17 The discussion of the "mechanical contracting hypothesis" can be found in Hagerman and Zmijewski (1979) and in the 
literature cited in Watts and Zimmerman (1986). 
18 Both CSB and INCOME data are deflated by the Consumer Price Index to 1977 dollars. 
TABLE 4 
Amounts of Write-Downs Taken by the Write-down Firms or Avoided by the Switch 
Firms in 1985 and 1986 

               WD85      WD86      WDT 
Quartilesᵃ     (MM$)     (MM$)     (MM$)     WDT/TA     WDT/EQUITY 


Panel A: Write-downs taken by the write-down firms (N = 60) 

25%             0.0       2.4       5.0       0.08       0.16 
50%             0.0      12.2      17.8       0.12       0.29 
75%            11.8      36.5      44.05      0.27       0.75 


Panel B: Write-downs avoided by the switch firms (N = 11) 

25%             —         —         1.1       0.01       0.03 
50%             —         —         8.2       0.02       0.05 
75%             —         —        71.2       0.25       0.25 


Panel C: Comparison of write-downs between switch and write-down firmsᵇ 

Wilcoxon's z-statistic                        5:92 88    -2.25** 


ᵃ WD85, WD86 = Ceiling test write-down (after tax effects) taken in the 1985 and 1986 fiscal years, respectively; WDT = 
WD85 + WD86; WDT/TA = WDT divided by total assets (in the absence of a write-down) in 1986; WDT/EQUITY = WDT 
divided by the common equity (in the absence of a write-down) in 1986. 

ᵇ The null hypothesis of the t-test and the Wilcoxon test is: the average write-down avoided by the switch firms is equal 
to the average write-down taken by the write-down firms (relative to total assets or equity). 

** Significant at the 0.05 level (two-tailed test). 


$H2_{alt}$ (bonus lower bound hypothesis): The write-down group has a higher portion of firms 
reporting a loss (before write-down) than the switch group in the accounting decision 
year. 


According to Dechow et al. (1994), compensation committees may intervene to protect 
managers from the unfavorable bonus effects of accounting charges which are beyond managerial 
control. To show the relevance of a write-down to bonus, we can use non-decision years' data to 
estimate equation (2) and then estimate the decision year's bonus, using the income before and 
after write-down as predictors. The hypothesis is stated as follows: 


$H3_{alt}$ (irrelevancy of write-down on bonus): The estimated executive bonus based on income 
before write-down is closer to the actual bonus than an estimate based on income after 
write-down. 
The basis of H3 is that if compensation committees intervened to shield managers from the effects 
of the write-down on bonus, the magnitude of write-down should not affect the actual bonus 
payment. 


V. EMPIRICAL RESULTS 


Test of Bonus Beta Hypothesis (H1) 


The results of the time-series regressions are reported in table 5.¹⁹ Panel A is based on all 
observations. To control for the effects of extreme values, we identify "outliers" by using the 
DFFITS statistic of Belsley et al. (1980).²⁰ Many firms (ten of the 12 switch firms, and 16 of the 
22 write-down firms) contain these outliers (usually one observation for each firm). The outliers 
are excluded from the regressions reported in panel B of table 5. 

Several of the sample firms reported positive income from continuing operations in only a 
few years, resulting in an insufficient number of non-zero observations of In( INCOME) for the 
empirical estimation of equation (2). Thus, the firms that did not have positive income in at least 
five years (see table 1) are not included in table 5. The two sampling criteria—a firm must have 
at least ten years of bonus data and at least five years of positive income—tend to eliminate the 
smaller and less profitable firms. 

According to panel A of table 5, the bonus beta is positive for most of the firms. The 
significance of the beta coefficients can be tested on an aggregate basis by the following 
Z-statistic: 


$$Z = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} \frac{t_j}{\sqrt{k_j/(k_j - 2)}} \tag{3}$$ 


where $t_j$ is the t-statistic for firm j associated with the estimate of $\beta_j$; $k_j$ is the degrees of freedom 
in firm j's regression; and N is the number of firms in the sample. Under the Central Limit Theorem 
and under the hypothesis of β = 0, the distribution of the Z-statistic is standard normal (Anderson 
1971). 

The significance of the Z-statistics calculated from equation (3) is conditional on the 
assumption that the t-values of each firm are independent. If dependence exists, the Z-value 
should be corrected as in Christie (1990): 


$$Z_\rho = \frac{Z}{\sqrt{1 + \rho(N - 1)}} \tag{4}$$ 


where ρ is the mean correlation among the sample firms' t-statistics. 
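
The aggregation in equation (3) and the dependence correction in equation (4) can be sketched in Python as follows. The function names are hypothetical. Equation (3) is implemented here as Z = (1/√N) Σ t_j / √(k_j/(k_j − 2)) and equation (4) as Z_ρ = Z / √(1 + ρ(N − 1)). The usage lines apply the correction to the Z-values, mean residual correlations, and sample sizes reported in the text and in table 5; no firm-level t-statistics are assumed.

```python
import math


def aggregate_z(t_stats, dfs):
    """Equation (3): scale each firm's t-statistic by its standard deviation
    under the null, sqrt(k / (k - 2)), then sum and divide by sqrt(N)."""
    if len(t_stats) != len(dfs) or not t_stats:
        raise ValueError("need one degrees-of-freedom value per t-statistic")
    total = 0.0
    for t, k in zip(t_stats, dfs):
        if k <= 2:
            raise ValueError("variance of t is undefined for k <= 2")
        total += t / math.sqrt(k / (k - 2))
    return total / math.sqrt(len(t_stats))


def adjusted_z(z, rho, n_firms):
    """Equation (4): deflate Z for a mean cross-sectional correlation rho."""
    return z / math.sqrt(1.0 + rho * (n_firms - 1))


# Reported inputs: switch firms (N = 12, rho = 0.242),
# write-down firms (N = 22, rho = 0.091), panel A Z-values from table 5.
switch_a = adjusted_z(2.088, 0.242, 12)
write_down_a = adjusted_z(4.442, 0.091, 22)
print(round(switch_a, 2), round(write_down_a, 2))
```

With the reported inputs, the correction reproduces the adjusted Z-values shown in table 5 up to rounding, which suggests the correction was applied with N equal to the number of firms in each group.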

In table 5, the Z-statistics for both groups of firms in both panels are all significantly positive. 
To examine the potential effects of cross-sectional dependence in the bonus structure, we 
calculated the pair-wise correlations of the residuals from the bonus regressions (using the pairs 
which contain at least ten years of concurrent residuals). The means for the switch and write-down 
groups are 0.242 and 0.091, respectively. As shown in table 5, under the 0.091 correlation, the 
adjusted Z-values for the write-down group are still significant at conventional levels. For the 
switch firms, the correlation of 0.242 will make the adjusted Z-value in panel A insignificant, but 


19 Although the time-series analysis reduces drastically the number of sample firms, it increases the number of periodical 
observations and enables inferences to be made through the aggregation of t-statistics across firms. 

20 Using the criterion of Belsley et al., we define an observation to be an "outlier" if it has an absolute value of DFFITS 
higher than $2[nind/(nobs - nind)]^{1/2}$, where nind is the number of independent variables, and nobs is the number of 
observations in each time-series regression. 
TABLE 5 
Time Series Estimation of Bonus Betaᵃ 

                                         Switch firms      Write-down firms 
                                         (reported         (INCOME before 
                                         INCOME)           write-down) 


Panel A: Estimation of bonus beta with all observations 

Mean                                      0.072             0.093 
First quartile                           -0.010             0.011 
Median                                    0.016             0.025 
Third quartile                            0.034             0.085 
Z-value for zero bonus betaᵇ              2.088             4.442 
(significance level, one-tailed)         (0.02)            (0.01) 
Adjusted Z-valueᶜ                         1.091             2.603 
(significance level, one-tailed)         (0.14)            (0.01) 


Panel B: Estimation of bonus beta after control for outliersᵈ 

Mean                                      0.087             0.113 
First quartile                           -0.085             0.010 
Median                                    0.005             0.029 
Third quartile                            0.034             0.111 
Z-value for zero bonus betaᵇ              2.951             5.345 
(significance level, one-tailed)         (0.01)            (0.01) 
Adjusted Z-valueᶜ                         1.542             3.133 
(significance level, one-tailed)         (0.06)            (0.01) 


ᵃ The estimation of the bonus beta for each firm is based on the following model, using the time-series data of between 
10 and 15 observations: 

$$\ln(CSB_t) = \sum_{i=1}^{n} \alpha_i D_{it} + \beta \ln(INCOME_t) + \epsilon_t$$ 

where $CSB_t$ = the cash salary and bonus paid to the CEO in year t; $INCOME_t$ = income from continuing operations; $D_{it}$ 
= 1 if the individual i is the CEO of the firm in year t, and 0 otherwise; n = the number of individuals who held the CEO 
position in the sample period. The sample consists of 12 switch and 22 write-down firms. 

ᵇ Under the null hypothesis that β = 0, each Z-statistic is distributed as standard normal. 

ᶜ The adjustment is based on equation (4), where ρ is estimated by the mean pair-wise correlation of bonus regression 
residuals for each group. 

ᵈ Outliers are those observations whose influence on the regression coefficients is relatively high, i.e., with DFFITS 
statistics higher than $2[nind/(nobs - nind)]^{1/2}$, where nind is the number of independent variables, and nobs is the number 
of observations in each regression (see Belsley et al. 1980). 
for panel B (the regressions without extreme values), the adjusted Z-value is 1.542, still 
significant at the 0.06 level. 

A comparison of the two groups indicates that the bonus betas of the switch firms are not 
significantly higher than those of the write-down firms.²¹ Thus, the time-series test cannot reject 
$H1_{null}$. There is no significant difference in the bonus/income relationship between the two groups 
to explain the different accounting decisions. 


Indirect Test of the Bonus Lower Bound Hypothesis (H2) 


To test H2, table 6 compares the profile of income from continuing operations in the decision 
year for the two groups. Panel A's comparison is based on the reported income, which could be 
based on either FC (after-write-down) or SE (with write-down avoided), while panel B compares 
the pre-write-down income under the FC method between the two groups. 

Panel A indicates that the write-down group has 18 firms (82 percent) reporting a loss after 
taking a write-down. In stark contrast, the switch group has only two firms (17 percent) 
showing a loss after switching to the SE method. The Chi-square statistic of 13.607 indicates that 
the classifications of profit/loss and write-down/switch are not independent. In panel B, the 
number of losing firms in the write-down group is reduced to 12 (55 percent), so write-downs 
forced six otherwise profitable firms to report a loss. On the other hand, among the eight switch 
firms with information available to estimate the as-if FC income, only one (12 percent) shows a 
pre-write-down loss. Again, the Chi-square test shows that the classifications of profit/loss and 
write-down/switch are not independent (with a Chi-square statistic of 4.224). Consequently, the 
results are consistent with $H2_{alt}$. The much higher number of firms reporting a loss before 
write-down points to the possibility that those firms would not pay an income-based bonus, making 
avoidance of write-downs unnecessary.²² 
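
The two independence tests can be checked with a short Python sketch of the Pearson chi-square statistic for a 2×2 table, using the profit/loss counts reported in table 6. The function name is hypothetical, and the absence of a continuity correction is an assumption, adopted because the uncorrected statistic matches the reported values.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    computed as N * (ad - bc)^2 / (row1 * row2 * col1 * col2)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must have a positive total")
    return n * (a * d - b * c) ** 2 / denom


# Rows: write-down firms, switch firms. Columns: income > 0, income < 0.
panel_a = chi_square_2x2(4, 18, 10, 2)   # reported income
panel_b = chi_square_2x2(10, 12, 7, 1)   # pre-write-down, FC-based income
print(round(panel_a, 3), round(panel_b, 3))
```

With one degree of freedom, the 0.05 critical value is about 3.84, so both statistics lead to rejection of independence at that level.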


Test of Irrelevancy of Write-down on Bonus Hypothesis (H3) 


The procedures for testing H3 are as follows. First, the parameters of the bonus equation (2) 
are estimated, excluding the write-down year's observation. These parameters are then used to 
make two predictions of the cash salary and bonus (CSB) in the write-down year, using two 
income numbers (pre- and post-write-down) as the independent variable. Finally, the two 
predicted CSB amounts are compared with the actual bonus, and two prediction error measures 
are calculated. If the write-down does not affect bonus payment, the prediction error based on pre- 
write-down income should be smaller than that based on post-write-down income. 

Twelve out of the 22 firms in the write-down group already reported a loss before write-down. 
The income variable in equation (2) is defined as zero if it is negative, resulting in the same pre- 
and post-write-down income variables for the 12 firms, so our test was performed for those ten 
write-down firms that showed pre-write-down profits. As shown in table 7, the mean percentage 
prediction error based on pre-write-down income is 13.5, not significantly different from zero. 
The median is an even smaller 6.0. 


21 If the t-values of the two groups are assumed to be independent, the difference between the bonus betas of the two groups 
can be tested by the statistic [Z(SW) - Z(WD)]/√2, where Z(SW) and Z(WD) are the adjusted Z-statistics calculated 
from equation (4) for the switch firms and the write-down firms, respectively. The values are 1.069 for panel A 
regressions, and 1.125 for panel B in table 5. Both are insignificantly different from zero. 

22 In a nonparametric Wilcoxon test on the pre-write-down FC income numbers reported in panel B, the average income 
level of the switch firms is also significantly higher than that of the write-down firms at the level of 0.06. Thus, even 
if the lower bounds are above zero, the write-down group will have more firms falling below the lower bounds than the 
switch group. 


Chen and Lee—Executive Bonus Plans and Accounting Trade-offs 







TABLE 6 
The Performance Comparison between the Switch and Write-Down Firms 
in the Decision Year 

                  Firms with     Firms with              First                   Third 
Sample Group      income > 0     income < 0      N       Quartile     Median     Quartile 


A. Comparison of reported income divided by total assetsᵃ 

Write-down         4 (18%)       18 (82%)       22       -15.9%       -8.1%      -1.9% 
Switch            10 (83%)        2 (17%)       12         0.6%        2.1%       4.6% 

Chi-square statistic testing the independence between the profit/loss classification and the 
write-down/switch classification: 13.607 (p<0.00) 


B. Comparison of pre-write-down, FC based income divided by total assetsᵇ 

Write-down        10 (45%)       12 (55%)       22        -7.9%       -2.2%       2.3% 
Switch             7 (88%)        1 (12%)        8         0.3%        1.6%       3.6% 

Chi-square statistic testing the independence between the profit/loss classification and the 
write-down/switch classification: 4.224 (p<0.05) 


ᵃ Reported income refers to income after write-down for the write-down firms, and income based on the newly adopted 
SE method for the switch firms. Total assets are the balance of the year prior to write-down or switch. 

ᵇ For the switch firms, the pre-write-down, as-if income based on FC method is either taken from financial statements or 
estimated using available information. 
TABLE 7 
Comparison of the Percentage Prediction Errors of Cash Salary and Bonus, Using 
Pre- and Post-Write-Down (WD) Incomes as Independent Variable 
for Profitable Write-Down Firmsᵃ 
(Sample size = 10) 

                                  Standard      First                    Third 
                      Mean        Deviation     Quartile     Median      Quartile 


1. The difference between actual cash salary and bonus (CSB) and estimated 
CSB based on pre-WD income 

                      13.5        41.5          -17.4         6.0         30.8 

t-value for H0: mean = 0          1.03 (p < 0.33) 


2. The difference between actual CSB and estimated CSB based on post-WD income 

                     -35.3        29.5          -68.4       -30.0        -17.7 

t-value for H0: mean = 0         -3.78 (p < 0.01) 


ᵃ Prediction error is defined as (predicted CSB - actual CSB)/actual CSB, where predicted CSB is based on the time-series 
regression: 

$$\ln(CSB_t) = \sum_{i=1}^{n} \alpha_i D_{it} + \beta \ln(INCOME_t) + \epsilon_t$$ 

In estimating the above regression, the write-down year's observation is excluded. The resultant parameters are then used 
to predict the write-down year's CSB, using that year's income measure as the input variable. 


On the other hand, the average percentage error based on post-write-down income is -35.3. 
This indicates that, had the compensation committees used post-write-down income to determine 
the bonus payments, the average bonus would have been lower than the actual payment by 35.3 
percent, which is significantly different from zero. The median, -30.0, is also much lower than 
zero. The results in table 7 indicate that the CEOs of profitable write-down firms received normal 
bonus payouts based on pre-write-down income.? Hence, our test results are consistent with 
H3,,, i.e., compensation committees intervened to shield these executive bonuses from the 
impact of write-downs. 
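
The t-values reported in table 7 can be checked from the reported means and standard deviations. The following is a minimal sketch, assuming the usual one-sample t-statistic for a null hypothesis of a zero mean; the function names are illustrative and do not come from the study.

```python
import math


def pct_prediction_error(predicted_csb, actual_csb):
    """Percentage prediction error as defined in the note to table 7."""
    return 100.0 * (predicted_csb - actual_csb) / actual_csb


def one_sample_t(mean, std_dev, n):
    """t-statistic for H0: mean = 0, using the sample standard deviation."""
    return mean / (std_dev / math.sqrt(n))


# Inputs are the means and standard deviations reported in table 7 (n = 10).
t_pre = one_sample_t(13.5, 41.5, 10)    # errors based on pre-WD income
t_post = one_sample_t(-35.3, 29.5, 10)  # errors based on post-WD income

print(round(t_pre, 2), round(t_post, 2))
```

Both values round to the t-values shown in the table (1.03 and -3.78).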


Logit Analysis of the Switch/Write-Down Decision 


The test of H3 indicates that write-down firms’ compensation committees might have
intervened to protect managers from the effects of the write-down. The remaining question is why
the switch firms' compensation committees did not provide the same protection. The answer


E Those firms’ pre-write-down income numbers in the write-down year were still larger than the lower bounds. The reason
is as follows. Had the income numbers fallen below the lower bounds, the income variable in equation (2) should have
been zero. Using a positive pre-write-down income to estimate cash compensation should result in an average prediction
error significantly larger than zero, which is not the case in table 7.
TABLE 8

Logit Analysis of Switch/Write-down Decisionsᵃ
Coefficient estimate (t-statistic)

Model                                                                        Model
#      Intercept    EXPI         SIZE     CNCENTR      BBETA      Nᵇ         Chi-square
1      -0.29        -4.08        0.23     --           -0.32      (12,22)    8.11**
       (-0.22)      (-1.91)**    (12)                  (-0.15)
2      1.91         -4.54        --       -1.95        0.84       (12,22)    9.49**
       (1.81)**     (-2.01)**             (-1.63)**    (0.39)

a The implied dependent variable = 1 for firms switching from the FC to the SE method; = 0 for firms that stayed with FC.
BBETA denotes the time-series bonus betas calculated from expression (2); the other independent variables are defined
in table 2.

b N denotes sample size. The first number in the parentheses is the number of switch firms; the second is the number of
write-down firms.

** Significant at the 0.05 level, one-tailed test, except for the intercept, which is based on a two-tailed test.


might lie in the different firm characteristics presented earlier. Specifically, table 2 shows that the 
switch firms tend to be larger, less aggressive in exploration, and more diversified. To model the
switch/write-down decisions formally, we include those three variables (EXPI, SIZE, and CNCENTR),
together with the estimated bonus beta (BBETA) derived from equation (2), in
logit regressions.

Because SIZE and CNCENTR are highly correlated (with a correlation coefficient of -0.82),
they are used separately in the two logit regressions in table 8. EXPI is significant in both models,
while CNCENTR is significant in model 2. The SIZE variable is not significant, but has the
expected positive sign.” On the other hand, the bonus beta (BBETA) variable is far from
statistically significant.”
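
To illustrate how the logit estimates translate into a predicted probability of switching, the following is a minimal sketch that plugs the model 1 point estimates from table 8 into the logistic function. The input values are hypothetical, since the scaling of EXPI, SIZE, and BBETA is defined elsewhere in the paper, and the function name is illustrative only.

```python
import math

# Point estimates from table 8, model 1 (the SIZE specification).
MODEL_1 = {"intercept": -0.29, "EXPI": -4.08, "SIZE": 0.23, "BBETA": -0.32}


def switch_probability(coefs, **x):
    """Logistic probability that the dependent variable equals 1 (switch)."""
    z = coefs["intercept"] + sum(coefs[name] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical firm characteristics, chosen only for illustration.
low_expl = switch_probability(MODEL_1, EXPI=0.1, SIZE=6.0, BBETA=0.5)
high_expl = switch_probability(MODEL_1, EXPI=0.5, SIZE=6.0, BBETA=0.5)
print(low_expl > high_expl)
```

Holding the other inputs fixed, the negative EXPI coefficient implies that a more exploration-intensive firm has a lower predicted probability of switching.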

The results in table 8 suggest that differences in exploration intensity and degree of
concentration lead managers to different strategies on bonus negotiation and accounting
choices. The operating characteristics make the switch firms' income less subject to the effects
of unsuccessful exploration costs. Therefore, the potential write-down might have motivated the
switch to the SE method for those firms with SE-like characteristics. The management in firms with
a pre-write-down loss would have less incentive to switch accounting methods.


* Table 2 shows that, in the full sample, the switch firms have a significantly higher SIZE than does the write-down group. 
The insignificant SIZE variable in table 8 could be due to either: (1) the 22 write-down firms which survive the bonus 
data requirement are larger than the 60 firms in the full sample; or (2) the effects have been captured by another 
independent variable, EXPI, in the multivariate regression. 

*We also included (but do not report) two additional variables in the logit model: LEVRG, the financial leverage ratio and
a proxy for covenant constraint (defined in section III), and a dividend restriction variable defined as common cash
dividends divided by unrestricted retained earnings. Neither variable is significant in the logit model. Thus, neither
debt covenant nor dividend considerations appear to be systematic factors in the write-down vs. switch decision.
VI. CONCLUSIONS


Around 1986, when oil prices declined dramatically, oil and gas companies faced an 
accounting dilemma—whether to switch from the FC method to the SE method or take a ceiling 
test write-down as required by the SEC’s Regulation S-X. We compare the firms either switching 
accounting methods or taking a write-down. The major findings are: (1) as measured by pre-write-down
income, the write-down group includes many more firms reporting a loss than the switch
group; (2) managers of write-down firms which were profitable before any write-down were 
shielded from the effect of the write-down in their bonuses in the decision year; and (3) the switch 
firms are found to be less concentrated in oil and gas exploration and more diversified in their 
operations than the write-down firms. As to the bonus/income relationship, we find that both 
groups’ bonuses are significantly correlated with accounting income and there is no significant 
difference in the bonus structure of these two types of firms. 

This study makes several contributions to the current literature about management’s 
accounting choice. First, regarding the relationship between accounting changes and bonus plans, 
this study identifies a circumstance where management might consider changing accounting 
methods to protect bonus payout. Moreover, consistent with the results of Dechow et al. (1994), 
we also show that executive bonuses are not affected by non-performance related factors such as 
ceiling test write-downs. 

This study also has several accounting policy implications. For example, the FASB (1993) 
is considering the guidelines on accounting for asset impairment. The oil and gas property write- 
down is one form of impairment in asset value. This study suggests that managers may take 
actions to circumvent the asset write-down when doing so is in their best interests. Thus, any 
regulation regarding asset write-downs should consider the possible circumvention by manage- 
ment. In addition, the controversy over the coexistence of two accounting methods in the oil and 
gas industry is likely to be rekindled by accounting regulators in the future. This study shows that 
the availability of two accounting methods provides management with leeway for manipulating 
income numbers. Finally, the switch from FC to SE method documented in this study was an 
attractive option because the SEC imposes the ceiling test on FC firms only. If the SEC mandated 
the same requirement for SE firms as well, fewer accounting changes might have occurred. 
ABSTRACT: We examined management's discretionary disclosures prior to a
special, yet important, event: a large earnings surprise. In what ways do managers
alert investors to the surprise, and what is investors’ reaction to such warnings? To
address these questions, we analyzed all managerial disclosures prior to the
surprising earnings release.

Less than ten percent of our large-surprise firms published quantitative earnings
or sales forecasts, while 50 percent of the firms kept silent. Firms facing earnings
disappointments were more likely to make a disclosure, and larger disappointments
were preceded more often by "harder" (more quantitative and earnings-related)
warnings. We found the likelihood of warnings to be positively associated with firm
size, the existence of a previous forecast, and membership in a high technology
industry. Finally, warnings tend to be issued for permanent earnings disappointments,
while transitory disappointments are more likely to occur without prior
warning.


Key Words: Discretionary disclosures, Earnings surprise, Warnings, Earnings persistence.


Data Availability: List of sample companies upon request.


I. INTRODUCTION 


Research on discretionary disclosure by managers has been recently extended beyond
earnings forecasts, which were investigated in the 1970s and 1980s (e.g., Pownall and
Waymire 1989, Lev and Penman 1990, Frankel et al. 1990, King et al. 1990). Among the
issues examined empirically were the reasons for the voluntary release of negative news (Skinner
1994), the relation between management disclosure and shareholder litigation (Francis et al.
1993, Lev 1993), and the association between various firm attributes and their disclosure policies
as reflected by ratings of financial analysts (Lang and Lundholm 1993). We pursue and extend 
the research on management discretionary disclosure (not restricted to earnings forecasts) by 
focusing on a special, yet important situation—the forthcoming release of a large earnings 
surprise. 

The specter of surprising investors, particularly disappointing them with large unexpected 
earnings, presents managers with a disclosure dilemma: to warn investors of the impending 
surprise prior to the earnings announcement or to keep silent. In fact, the dilemma is considerably 
more complicated than this binary choice: warnings can take various forms (e.g., a specific 
earnings forecast or a qualitative production report), and can be communicated through alternative
channels (e.g., a public announcement via the news wires or a conference call with analysts).1
To learn how managers cope with this dilemma, we examine all types of public disclosures 
(quantitative as well as qualitative) made by managers of the sample firms prior to the earnings 
announcement and identify company and industry attributes which distinguish firms that alert 
investors to the earnings surprise from those that keep silent. We extend the analysis of 
discretionary disclosure beyond the commonly-used disclosure/no-disclosure dichotomy and 
examine the choice of the specific type of message—from “hard” earnings forecasts to “soft” 
operating or product-related releases. We then examine the capital market consequences of 
warnings, focusing on the combined reaction of investors to the warnings and the subsequent 
earnings announcements. 

We find that the frequency of quantitative earnings or sales forecasts is rather low even among 
large-surprise firms: only six percent of the positive surprise companies and nine percent of the 
negative surprise firms disclosed such information. On the other end of the disclosure spectrum, 
roughly half the sample firms did not provide any operating information prior to surprising 
investors. The remaining companies provided some qualitative information prior to the earnings 
announcement. The frequency of voluntary disclosure of negative information (warnings) is 
twice that of positive information among our large-surprise firms, a phenomenon noted recently
in a random sample (Skinner 1994).

With respect to the attributes distinguishing warning from no-warning firms, we find that the
likelihood of issuing a warning increases with the size of the earnings surprise (expectations gap),
the existence of an earlier prospective disclosure made by management, a firm's membership in
a high-technology industry, and firm size. On the other hand, regulated firms warn less frequently
than unregulated ones. These findings are consistent with various economic and legal hypotheses
discussed below. With respect to the choice of disclosure type, we find that the larger the
impending earnings surprise, the more quantitative (“hard”) and earnings-related is the warning.
Thus, it appears that the form and content of the warning are chosen by managers to match the
seriousness of the expectations gap. Furthermore, the “harder” the warning, the more pronounced 
is investors' reaction to it and the quicker is the closure of the expectations gap. 

Finally, turning to investors' reaction to pre-surprise disclosures, we find that for the bad 
news firms, the combined reaction to the warning and the subsequent earnings announcement is 


1 The Wall Street Journal recently commented on a shift in disclosure channels used by public companies: “Such
conference calls [to analysts] seem to be replacing public meetings as the vehicle of choice to get out bad news -- which
then filters out, through the analysts, to selected investors. Companies like doing them because they're quicker and
cheaper than setting up meetings. And by screening the calls, they can try to keep out analysts they don't like.” (August
25, 1993, p. C2).
significantly more negative for firms that warned investors than the reaction to the earnings
announcement of the no warning firms. A further examination of this finding suggests that 
warnings tend to be issued by firms whose earnings disappointment is permanent while transitory 
earnings declines often go unwarned. Investors' reaction appears related to the permanence of the 
earnings disappointment. Nevertheless, we cannot rule out the possibility that the observed 
relatively large negative reaction to warnings is also due to investors reading into the warning 
more than managers intend. While warnings are typically limited to the forthcoming earnings 
announcement, they may raise investor concerns about long-term competitiveness and economic 
viability, contributing to large price declines. This may provide a clue to an intriguing question 
raised by our findings: if warnings are as beneficial as often claimed (e.g., decrease investors’ 
transaction costs, deter litigation), why is it that about half our sample firms did not provide any 
operating information prior to surprising investors? Perhaps concerns with an overreaction to
warnings explain such prevalent corporate silence. 

Section II outlines the sample selection procedures and section III describes the character- 
istics and types of managerial warnings. Section IV presents the results of a multi-response logit 
analysis aimed at identifying attributes associated with the frequency and type of disclosures. 
Section V examines investor reaction to warnings, while section VI provides explanations for these
reactions. Section VII concludes the study.


II. SAMPLE SELECTION AND ITS CHARACTERISTICS 


We focus on managers’ disclosure policy when facing a large earnings surprise. Accordingly, 
the sample was selected by ranking the firms on the 1992 COMPUSTAT quarterly tape by their 
earnings surprise, relative to analysts' forecasts, in the fourth fiscal quarter of the years 1988, 1989
and 1990.2 These three years were chosen to represent a variety of economic conditions: 1988
was characterized by significant increases in corporate earnings (e.g., the average return on equity
of all manufacturing corporations was 16.1% in 1988 vs. 12.8% in 1987), while 1989 and 1990
were characterized by earnings decreases (average return on equity ratios of 13.6 and 10.7%,
respectively, for all manufacturing corporations).3 From each year's ranking of firms by their
earnings surprise, we selected the companies with surprises (both positive and negative) larger
than one percent of their stock price. This yielded a total sample of 622 firms, of which 186 had
positive earnings surprises (“good news firms”) and 436 had negative surprises (“bad news
firms”).

Firms involved in corporate control changes (e.g., mergers, acquisitions, divestitures) around 
the earnings announcement are problematic for this study, since stock price reactions to control 
change announcements are generally very large and may swamp any reaction to the discretionary 
disclosures examined here. Furthermore, managers involved in control changes may exhibit a 
different disclosure behavior than that in “normal times.” Therefore, we eliminated from the 
sample 57 firms that had change of corporate control announcements during the examined 
interval (specified below). The final sample consists of 565 firms—of which 171 are good news 
and 394 are bad news firms—representing most of the 2-digit SIC codes on COMPUSTAT. 

The measurement of the earnings surprise, the length of the disclosure observation interval, 
and the various stock return windows can be understood using the following time line: 


2 The fourth quarter of each year was chosen because it had by far the largest number of analysts’ forecasts on the IBES 
Detail Tape. Also, previous research indicated that a proportionately large number of discretionary disclosures are 
released in the fourth quarter (e.g., McNichols 1989). 

3 Data obtained from the Economic Report of the President, January 1993, 451. 
[Time line: E3, followed by a 30-day interval (analysts' forecasts), followed by an interval of about 60 days
(discretionary disclosures), ending at E4]


E3 and E4 denote the earnings announcement dates of a sample firm for the third and fourth fiscal
quarters, respectively. We split the interval between these two earnings announcements into a
30 calendar day interval subsequent to the E3 announcement for collection of analyst forecasts of
E4 earnings, and the remaining interval up to the E4 announcement (typically extending over 60
calendar days) for collection of managers' voluntary disclosures. Analysts' fourth quarter
earnings forecasts were obtained from the IBES Detail Tape. The 30-day interval for collecting
analyst forecasts was specified so that our earnings surprise measure for E4 will be based only on
updated forecasts, made by analysts after observing E3. Some analysts predict primary EPS
while others predict fully diluted EPS, so we measured the E4 earnings surprise (reported EPS
before extraordinary items minus forecasted EPS) for a sample firm by first computing the
surprise for each analyst (using either primary or diluted EPS), and then averaging the individual
surprises to obtain an average earnings surprise for a sample firm. Each firm’s earnings surprise
was deflated by the stock price at the end of the third quarter.
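
The surprise measure described above, together with the one percent selection threshold from the sample selection step and the seasonal random walk alternative mentioned in the accompanying footnote, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the data layout (a list of forecasts tagged by EPS basis), and the example numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def analyst_surprise(reported_eps, forecasts, price):
    """Average per-analyst surprise, deflated by the stock price.

    reported_eps: dict mapping an EPS basis ("primary" or "diluted") to the
        reported EPS before extraordinary items on that basis.
    forecasts: list of (basis, forecast_eps) pairs, one per analyst.
    price: stock price at the end of the third quarter.
    """
    per_analyst = [reported_eps[basis] - forecast for basis, forecast in forecasts]
    return mean(per_analyst) / price


def seasonal_rw_surprise(eps_q4, eps_q4_prior_year, price):
    """Seasonal random walk surprise: change from the year-earlier quarter."""
    return (eps_q4 - eps_q4_prior_year) / price


def classify(surprise, threshold=0.01):
    """Keep only surprises larger in magnitude than one percent of price."""
    if surprise > threshold:
        return "good news"
    if surprise < -threshold:
        return "bad news"
    return None  # surprise too small; the firm is not sampled


# Hypothetical firm followed by two analysts using different EPS bases.
reported = {"primary": 1.00, "diluted": 0.95}
s = analyst_surprise(reported, [("primary", 1.30), ("diluted", 1.15)], price=20.0)
print(classify(s))
```

Matching each forecast to reported EPS on the same basis before averaging keeps the two EPS definitions from being mixed within a single analyst's surprise.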

Table 1 presents descriptive statistics for our sample. The mean earnings surprise of the good 
news firms is 2.9% of stock price (3.8% for the random walk surprise), while the mean surprise 
for the bad news firms is -7.4% (-5.6% for random walk). The median surprises are smaller than 
the means, indicating the existence of large (positive and negative) earnings surprises in the 
sample. Note that the mean, and particularly the median, earnings surprises based on analysts’ 
forecasts (AF) are close to the random walk (RW) mean and median surprises. (The correlation 
between the two surprise measures is 0.70.) This similarity suggests that our findings are not 
sensitive to the way in which earnings surprise is measured. (This conclusion is corroborated by 
the various replications of the tests using the two surprise measures, as reported below.) 

The mean and median annual sales growth rates of the good and bad news firms (over a three-year
period ending with the earnings surprise) are close (10.6 and 12.3% per year at the mean)
and are positive for both groups of firms, indicating that the bad news firms were not particularly
poor performers in terms of sales. The sample firms (both good and bad news) are larger at the
median than the 1992 COMPUSTAT NYSE and AMEX population: $421 and $185 million,
respectively, at the end of 1990. At the mean, however, the size of the sample firms ($1,403
million) is close to the COMPUSTAT population ($1,438 million). (These figures are not reported in
table 1.) Within the sample (table 1), the bad news firms are larger at the mean than the good news 
firms, yet the median sizes of the two groups are close. The systematic risk (beta) of the bad news 


4 The earnings announcement dates were those on the NEXIS file from which we collected corporate disclosures. In a few
cases these dates were different from the COMPUSTAT dates.

5 We want to avoid contaminating the earnings surprise measure with “stale forecasts” made prior to the third quarter
earnings announcement. Of the sample firms, 49 percent had one analyst forecast during the 30 day period; 21 percent
had two forecasts; 12 percent had three forecasts; and the remaining firms had four or more forecasts.
Given the relatively large number of earnings surprises based on one analyst forecast, we replicated all our tests with
an alternative surprise measure, the seasonal random walk (i.e., fourth quarter earnings minus year-earlier fourth quarter
earnings, divided by stock price). In general, we did not find significant differences between results based on analysts’
forecasts and those derived from random walk surprises.

6 These mean price-deflated earnings surprises are fairly large. For example, a three percent negative price-deflated
earnings surprise, when considered permanent by investors, will cause a 45 percent stock price decline for a firm with
a Price/Earnings multiple of 15 (0.03 x 15 = 0.45, assuming the multiple does not change by the earnings announcement).
TABLE 1
Sample Descriptive Statistics

                                 Good News Firms¹ (N = 171)         Bad News Firms¹ (N = 394)
                                 Mean       Std. Dev.   Median      Mean       Std. Dev.   Median
Earnings Surprise (AF)²          0.0292     0.0308      0.0182      -0.0737    0.1386      -0.0301
Earnings Surprise (RW)²          0.0377     0.0678      0.0184      -0.0560    0.1967      -0.0299
Annual Sales growth³             0.106      0.149       0.077       0.123      0.244       0.083
Size (Market capitalization      1,146      1,753       381         1,515      4,140       436
  in $ millions)
Beta                             1.08       0.39        1.08        1.14       0.36        1.14
Market-Adjusted Returns:⁴
  Long window                    0.0374*    0.1582      0.0250**    -0.0620*   0.1693      -0.0746*
  5-day warning                  0.0310*    0.0761      0.0193**    -0.0280*   0.0960      -0.0148*
  5-day earnings                 0.0198*    0.0870      0.0096*     -0.0142*   0.0741      -0.0158*


1 “Good News” (“Bad News”) firms are those whose fourth quarter earnings were larger (smaller) than analysts’ forecasts
of earnings.
2 AF indicates an earnings surprise based on analysts' forecasts, while RW indicates a seasonal random walk earnings
surprise. The earnings surprise is computed as actual minus predicted EPS, divided by beginning-of-quarter stock price.
3 Computed over a 3-year period ending with the examined quarter.
4 “Long window” returns are cumulated from the 31st day after the third quarter earnings announcement through the 2nd
day after the fourth quarter announcement. “5-day warning” returns are cumulated from -2 through +2 days around the
primary voluntary disclosure. “5-day earnings” are returns cumulated around the fourth quarter earnings announcement.
All returns are market-adjusted returns, which are computed as the stock’s raw return over the interval minus the
corresponding CRSP equally-weighted market return.
*, ** = statistically significant, one-tail t-test, at the 0.01 and 0.05 levels, respectively.





firms (mean = 1.14) is close to that of the good news firms (mean = 1.08), and both are only slightly 
higher than the population mean beta of one. Thus, with respect to both firm size and risk, our 
sample of large-surprise NYSE and AMEX firms does not differ much from the COMPUSTAT 
population. Finally in table 1, the mean and median market-adjusted stock returns for the three 
windows examined in section V are all significantly different from zero and their sign conforms 
with that of the earnings surprise. 


III. TYPES OF DISCLOSURES 


Most studies on discretionary managerial disclosure were limited to a small subset of 
corporate communications, typically quantitative earnings forecasts. We consider all public
disclosures made by the sample firms from the 31st day after the announcement of third quarter
earnings (E3) to the fourth quarter (E4) earnings announcement (see time-line above). This
relatively narrow disclosure window (typically of 60 calendar days) was dictated by the focus of
our study: the disclosures of managers facing a large earnings surprise. Generally, managers
realize that they are going to surprise investors only towards the end of the quarter as the periodic
results are computed. Managers' disclosures were collected by retrieving from the NEXIS News
(Wires) file all corporate announcements made through the various news wire systems and other
means.7 Each firm's disclosures were classified into the following eight disclosure types, grouped
into three categories:
I. Earnings or sales forecasts
1. Point estimates of fourth quarter sales or earnings.
2. Range estimates of fourth quarter sales or earnings.
3. Qualitative disclosures about fourth quarter sales or earnings.
II. Other operating information
4. Miscellaneous operating data: asset writeoffs, gains on asset sales, load factors, order
backlog, etc.
5. Announcements of capital expenditures, new products, major contracts, joint ventures,
etc.
6. Shareholder payouts, mainly dividend changes and stock repurchases.
III. Nonoperating or no-information
7. Nonoperating announcements, typically appointments of officers and board members,
community services, etc.
8. No discretionary disclosures made during the examined window.


Disclosure types 1—3 consist of explicit forecasts (quantitative and qualitative) of fourth 
quarter sales or earnings. These are the most specific ("hard") releases concerning the forthcom- 
ing earnings announcement. Disclosure types 4—6 provide indirect or partial information about 
earnings, including announcements of components of earnings or asset writeoffs, as well as 
information on load factors and order backlogs. Monthly realized sales data, typical of retailers, 
also fall into this category. Type 5 disclosures pertain to the long-term (capital expenditures, new 
products and joint ventures), while type 6 announcements (shareholder payouts) may provide 
signals with respect to future earnings. Type 7 disclosures (e.g., new board members) appear to 
have no direct relevance to the forthcoming quarterly outcome and are therefore grouped with the 
no disclosure category (type 8). 

Some sample firms made multiple disclosures during the examined window. For each of 
these firms we identified the "primary disclosure" as the highest-ranked disclosure type in the 
above ranking (i.e., the most quantitative, earnings-related disclosure).8 This primary disclosure
was the one examined in most of the tests that follow, since it is not obvious how to aggregate and
weigh multiple disclosures. (For example, is a type 1 disclosure more or less relevant than two
disclosures of, say, type 4 and type 6?) We thus ignore some disclosures in our tests. Table 2
presents the distribution of the number of multiple disclosures made by the sample firms,
classified by the "primary disclosure."9 For example, of the six good news firms that released a
point estimate of earnings or sales (top line in table 2), five firms (83.3%) did not make
announcements of any other type during the examined window, while one firm (16.7%) made an 
additional disclosure (to the primary one). No firm in this category made more than two 


7 NEXIS also reports managerial disclosures made through conference calls with analysts. However, we were unable to
ascertain the completeness of this coverage. Of course, we miss disclosures which were made privately to investors.
However, given that it is illegal to disclose privately value-relevant information, this omission is probably of negligible
importance. Private disclosures made to lenders or bond raters are also not included in our study.

8 For example, if a given firm made within the disclosure window the following three announcements: election of a new
board member (type 7), a planned capital expenditure (type 5), and a point estimate of fourth quarter earnings (type 1);
the latter, type 1 announcement is the primary disclosure. However, if only the first two disclosures were made, then
the planned capital expenditure would be designated as the primary disclosure.

9 Table 2 data do not include multiple announcements of disclosure type 5 (e.g., capital expenditures, new products, etc.).
These disclosures were numerous for many sample firms and it was infeasible to code all of them and analyze them
further (e.g., determine market reaction to each one). As a practical expedient, we considered the type 5 disclosure
closest to the earnings announcement (presumably the most updated) as the primary disclosure (if the firm did not release
higher-ranked disclosures).
TABLE 2
The Frequency of Sample Firms Making One, Two, or More Disclosures During the
Examined Interval, Classified by Type of Primary Disclosure


Type of Primary              % With One         % With              % With More Than
Disclosure          N        Disclosure Only    Two Disclosures     Two Disclosures
Good News Firms
1                   6        83.3               16.7                0
2                   4        100.0              0                   0
3                   6        50.0               33.3                16.7
4                   9        55.6               44.4                0
5¹                  29       -                  -                   -
6                   20       90.0               10.0                0
7                   62       -                  -                   -
8                   35       -                  -                   -
Total               171
Bad News Firms
1                   14       64.3               21.4                14.3
2                   20       70.0               10.0                20.0
3                   49       67.3               26.6                6.1
4                   52       80.8               15.4                3.8
5¹                  58       -                  -                   -
6                   26       96.2               3.8                 0
7¹                  103      -                  -                   -
8                   12       -                  -                   -
Total               394


1 The frequencies in this table do not include disclosure types 5 and 7 (the latter considered by us as nondisclosure) which
were generally numerous.


disclosures. It is obvious from table 2 that for the majority of sample firms the primary disclosure 
was the only one made during the examined window (except for type 5 disclosures), so not much 
relevant information seems to be lost by our focus on the primary disclosure. 
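
The primary-disclosure rule described above (keep the highest-ranked, i.e., lowest-numbered, disclosure type a firm released during the window) can be sketched as follows. The function name is illustrative, and the sketch does not model the separate expedient of choosing the type 5 release closest to the earnings announcement.

```python
def primary_disclosure(disclosure_types):
    """Return the highest-ranked (lowest-numbered) disclosure type.

    disclosure_types: iterable of integers 1 through 7, one per announcement
    found in the window. An empty input maps to type 8 (no disclosure).
    """
    return min(disclosure_types, default=8)


# The example from the footnote: a new board member (type 7), a planned
# capital expenditure (type 5), and a point estimate of earnings (type 1).
print(primary_disclosure([7, 5, 1]))
print(primary_disclosure([7, 5]))
```

With all three announcements the primary disclosure is type 1; with only the first two it is type 5, as in the footnote's example.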

"Special items," such as restructuring charges and asset writeoffs, pose a potential problem 
in our study. Corporate restructuring was prevalent during the examined period, 1988—1990, and 
firms tend to announce restructuring charges during the fourth fiscal quarter, which is the focus 
of our analysis. Therefore, it is likely that our sample includes a large number of disclosures 
relating to restructuring. Since such disclosures may be atypical (e.g., mostly implying negative 
earnings news), as may be the market reaction to them (e.g., a sometime observed positive 
reaction to a large asset writeoff), we conducted our entire analysis first on the total sample, and 
then on the subsample that excludes firms with special items. These items were identified by the 
COMPUSTAT category of Special Items (data item No. 32, quarterly tape), which includes 
adjustments for prior years, significant nonrecurring items, results of discontinued operations, 
profit or loss on sale of assets, and writedowns or writeoffs. A “special item firm" was identified
as one for which total special items exceeded ten percent of fourth quarter total expenses (before 
special items), when the special items’ balance was a debit, or ten percent of total revenues (before 
special items) when the balance of special items was a credit. Nine good news firms (out of 171) 
and 58 bad news firms (out of 394) fall into the category of “special item firms.” Recall that our 
earnings surprise measure, which underlies the selection of the sample firms, is based on EPS 
before extraordinary items. Thus, all the “below the line” special charges and transitory items are 
excluded from our sample, while the above selection procedure culls out the large “above the line” 
special items. 
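
The ten percent screen for "special item firms" can be expressed as a short rule. The sketch below is illustrative only: the function name is hypothetical, and the sign convention (negative for a debit balance, positive for a credit balance) is an assumption made here for the example rather than something stated in the text.

```python
def is_special_item_firm(special_items, total_expenses, total_revenues,
                         threshold=0.10):
    """Apply the ten percent screen to fourth quarter special items.

    special_items: net special items; negative = debit balance (a charge),
        positive = credit balance (a gain). This sign convention is assumed.
    total_expenses, total_revenues: fourth quarter totals before special items.
    """
    if special_items < 0:
        # Debit balance: compare with total expenses.
        return abs(special_items) > threshold * total_expenses
    # Credit balance: compare with total revenues.
    return special_items > threshold * total_revenues


# Hypothetical quarter: expenses of 100 and revenues of 150 (in $ millions).
print(is_special_item_firm(-12.0, 100.0, 150.0))
print(is_special_item_firm(12.0, 100.0, 150.0))
```

The same absolute amount can therefore be classified differently depending on whether it is a charge (scaled by expenses) or a gain (scaled by revenues).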

Table 3 reports on the disclosure policies of the sample firms: it classifies the primary 
disclosure frequencies of the firms by type of information. The numbers in parentheses refer to 
the subsample excluding large special item firms. It is evident that disclosure of quantitative 
(“hard”) estimates of forthcoming sales or earnings (disclosure types 1 and 2), is a rare 
phenomenon even among firms soon to surprise investors: only 5.8% and 8.7% of the good and 
bad news firms, respectively, disclosed such information. The disclosure frequency is higher 
when qualitative quarterly earnings or sales estimates (type 3) are included: 9.4 and 21.1% 
(subtotal of category I) of the good and bad news firms, respectively, provided either quantitative 
or qualitative earnings/sales forecasts. When special item firms are excluded (numbers in 
parentheses), the disclosure frequencies are largely unchanged. 

In conformity with Skinner's (1994) findings (based on a random sample of NASDAQ firms, 
contrasted with our large-surprise NYSE and AMEX sample), the disclosure frequencies in table 
3 are skewed toward bad news firms: the difference between the good and bad news firms’ 
disclosure frequencies of earnings/sales estimates (9.4% vs. 21.1%) is statistically significant at 
the 0.01 level (two-tail t-test). The disclosure asymmetry toward bad news firms holds also for 
type 4 releases: earnings components, writeoffs, load factors, etc. This is the category with the 
largest number of special items, yet even after the exclusion of firms with large special items the 
disclosure frequency of bad news firms is double that of the good news firms (9.8% vs. 4.9%; 
numbers in parentheses for disclosure type 4). 
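
As a rough check on the good/bad news asymmetry, the comparison of earnings/sales forecast frequencies can be sketched from the table 3 counts (16 of 171 good news firms vs. 83 of 394 bad news firms). The sketch below uses a pooled two-proportion z-test under the normal approximation; this is a substitute for, and not necessarily identical to, the two-tail t-test reported in the text, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tail test of equal proportions using the pooled normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-tail p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Counts from table 3, subtotal of category I (full sample).
z_stat, p_val = two_proportion_z_test(16, 171, 83, 394)
```

Under this approximation the statistic is negative (good news firms disclose less often) and the p-value falls below 0.01, in line with the significance level reported above.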

Finally and notably, roughly half our large-surprise firms did not disclose any operating 
information (types 7 and 8) prior to the earnings announcement. It is clear from this prevalence 
of "silent firms" that warning investors of an impending surprise is not deemed cost-beneficial 
by a considerable number of firms, a finding we will return to in section VI. 


IV. ATTRIBUTES OF DISCLOSING AND NONDISCLOSING FIRMS 


The data in table 3 indicate a considerable variability in disclosure policy, with roughly half 
the sample firms not disclosing any operating information prior to surprising investors. This 
diversity raises the question of what attributes distinguish firms that alert investors to the 
impending earnings surprise from those that keep silent? In contrast with most previous 
research on discretionary disclosure which considered the choice to be dichotomous—disclose 
or not disclose—we allow for a wider choice, given that managers have a rich menu of disclosure 
types, such as point estimates of earnings, range estimates, or statements on changes in order 
backlog and operating capacity. Operationally, we distinguish among the three major disclosure 
categories in table 3—earnings/sales forecasts, other operating information, and no informa- 
tion—and examine the following question: what factors (attributes) are associated with the 
specific disclosure choices made by managers? A set of hypothesized factors follows. 

TABLE 3 
Discretionary Disclosures Made by Sample Firms 

Disclosures Made by All Sample Firms (N=565) During a 60-Day Interval Prior to Announcing Fourth 
Quarter Surprises, Classified by Type of Information. Numbers in Parentheses Refer to the Subsample of 
Firms Excluding Those with Large Special Items. 

                                              Good News Firms              Bad News Firms
Disclosure type                               N           %                N            %
I. Earnings or sales forecasts
   1. Point estimates of EPS or sales         6 (6)       3.5 (3.7)        14 (12)      3.6 (3.6)
   2. Range estimates of EPS or sales         4 (4)       2.3 (2.5)        20 (14)      5.1 (4.2)
   3. Qualitative disclosures about
      EPS or sales                            6 (6)       3.6 (3.7)        49 (43)      12.4 (12.8)
   Subtotal                                   16 (16)     9.4 (9.9)        83 (69)      21.1 (20.6)
II. Other operating information
   4. Earnings components, writeoffs,
      load factors, backlogs, etc.            9 (8)       5.3 (4.9)        52 (33)      13.2 (9.8)
   5. Capital expenditures, new
      products, contracts, etc.               29 (27)     17.0 (16.7)      58 (51)      14.7 (15.2)
   6. Shareholder payouts                     20 (19)     11.7 (11.7)      26 (22)      6.6 (6.5)
   Subtotal                                   58 (54)     34.0 (33.3)      136 (106)    34.5 (31.5)
III. Nonoperating or no information
   7. Non-operating information               62 (59)     36.2 (36.4)      103 (93)     26.1 (27.7)
   8. No disclosures made                     35 (33)     20.5 (20.4)      72 (68)      18.3 (20.2)
   Subtotal                                   97 (92)     56.7 (56.8)      175 (161)    44.4 (47.9)
Total                                         171 (162)   100% (100%)      394 (336)    100% (100%)





Size of earnings surprise 


It is widely believed that managers wish, in general, to align investors’ expectations with the 
forthcoming financial results, in order to decrease investors’ transaction costs, avoid large stock 
price fluctuations, and shield analysts from embarrassment.¹⁰ Accordingly, we expect that the 


¹⁰ See Ajinkya and Gift (1984) and King et al. (1990) for elaboration on the motives and evidence related to this 
expectations alignment ("expectations adjustment") hypothesis; in particular, note the argument that such an 
expectations adjustment reduces investors' transaction costs arising from asymmetric information. However, this may not be 
the only reason for discretionary disclosure. For example, several CEOs indicated in private conversations with the 
authors that another important disclosure objective is to "avoid embarrassing our analysts" by surprising them. 
Analysts' ill feelings and loss of confidence in management as a result of such surprises (particularly disappointments) 
apparently impose a heavy cost on firms. 

wider the expectations gap, the “harder” (more quantitative) and more directly related to earnings 
will be the discretionary disclosure. Stated differently, in order to substantially narrow a large 
expectations gap, the disclosure will have to be more quantitative and specifically related to 
forthcoming earnings (or sales), whereas the closure of a relatively small expectations gap might 
require only the release of a statement of last month’s sales or information about the change in 
capacity utilization. Obviously, managers will be reluctant to issue a quantitative earnings 
forecast, which can be ex-post verified and could lead to loss of reputation and even litigation, 
unless the expectations gap (information asymmetry) is large. 

Indeed, the data in table 4 indicate that the magnitude of earnings surprise decreases 
monotonically (in absolute terms) as one moves from firms issuing earnings-related forecasts, 
through those releasing general operating information, to the non-disclosing firms (except for the 
means of the good news firms). The earnings surprise decrease is more pronounced for the bad 
news firms than for the good news ones.¹¹ Thus, it appears that “hard” warnings are used to narrow 
large expectations gaps, a conclusion consistent with the mean stock price reaction around the 
warning—reflecting the closing of the gap—also reported in table 4. For both the good and the 
bad news firms, the mean reaction to the earnings/sales forecasts is substantially larger (in 
absolute terms) than the mean reaction to the release of general operating information. Signifi- 
cance tests for differences in table 4 data are conducted in the multivariate analysis reported in 
table 5. 


Prior prospective statement 


The existence of a previously released prospective statement by management is also expected 
to affect the disclosure decision. Such statements create a legal liability to "correct" or “update” 
investors' expectations: 


Those courts that have considered the issue generally have concluded that rule 10b- 
5 [of the Securities Exchange Act of 1934, prohibiting fraudulent conduct in connection 
with the purchase or sale of securities] requires issuers to correct statements that they 
later learn were materially misleading at the time they were made....Commentators 
generally have accepted that rule 10b-5 imposes on issuers a duty to correct, and on 
occasion have suggested issuers have a duty to update as well....statements that by their 
terms purport to continue to be valid beyond the date they were disseminated. (Rosenblum, 
1991, pp. 301, 302, 304, emphasis added). 


Thus, when managers have issued prospective statements in the past which are still outstanding 
(i.e., reflected in investors’ expectations), there is a legal liability and consequently strong 
incentives for managers to warn investors of a large earnings surprise. No warning in such cases 
may be construed as failure to correct or update the earlier statement. 

We searched for the existence of prior prospective statements made by the sample firms that 
disclosed type 1 through 3 messages (i.e., earnings/sales forecasts) through the NEXIS News 
(Wires) file over the three fiscal quarters preceding the one whose earnings are examined in this 
study. The reason for limiting the search for previous forecasts to firms issuing disclosures of type 
1 through 3 during the examined interval is that we consider in this section corrections or updates 
of previous forecasts. Such corrections are generally made by a subsequent forecast (i.e., 
disclosures of types 1—3). It does not seem plausible that other types of disclosure (e.g., a new 
product announcement or a dividend change) will qualify as a correction or an update of a previous 


¹¹ When the earnings surprise is measured by seasonal random walk, the decreasing pattern in table 4 holds for the bad 
news firms, but not for the good news ones. 

forecast. An earnings or sales forecast (quantitative as well as qualitative) concerning the fourth 
quarter earnings, made during the first three fiscal quarters, was designated as a “prior prospective 
statement.” 

For comparison purposes we also examined the sample nondisclosing firms (types 7 and 8) 
for the existence of a prior prospective statement made during the preceding three quarters. Of 
the sample firms examined, 13 percent of the good news firms and 22 percent of the bad news 
firms, respectively, had prior prospective statements. The distribution of these statements across 
disclosure types accords with the duty-to-correct hypothesis: 25.0 (30.1)% of the good (bad) 
news firms which issued types 1—3 disclosures during the observation window had a prior 
forecast, whereas only 11.3 (17.7)% of the nondisclosing good (bad) news firms (types 7-8) had 
a prior forecast. The difference for the bad news firms (30.1% vs. 17.7%) is significant at the 
0.04 level by a two-tail t-test. Warning investors ahead of the earnings announcement thus appears 
to be associated with the existence of a prior forecast. 


Industry and regulatory status 


High technology (“high tech”) firms appear to be exposed to a larger-than-average risk of 
shareholder lawsuits, particularly at the early stage of operations.¹² Among the reasons for the 
prevalence of shareholder lawsuits against high tech firms is their relatively high risk, resulting 
in large price fluctuations and potential losses to investors. The aggressive accounting techniques 
sometimes used by such firms (e.g., front loading of gains from long-term contracts, excessive 
capitalization of software development costs) may also contribute to litigation exposure. Given 
this exposure, high tech companies may be motivated to disclose more than firms in other 
industries in order to fend off investors’ suspicion and litigation. Indeed, for the bad news firms, 
14.2% of those disclosing types 1-6 messages were high tech firms, while only 5.1% of 
nondisclosers were high tech firms (the difference is significant at the 0.02 level, two-tail t-test). 
For the good news firms this difference is statistically insignificant. 

Regulated firms (e.g., utilities, banks) provide to regulators, and thereby indirectly to the 
general public, a considerable amount of operating information. Such information is often more 
detailed and more timely than quarterly financial reports. It can, therefore, be expected that 
regulated firms encounter less information asymmetry with investors than other companies, and 
will therefore engage in a lower level of discretionary disclosure. 


Firm size and risk 


Firm size was found in previous studies to be associated with the frequency and quality of 
corporate disclosure (e.g., Lang and Lundholm 1993). Economies of scale in disclosure may 
contribute to the tendency of large firms to disclose more than small ones. Furthermore, large 
firms are more exposed to litigation (having “deeper pockets") than small ones, and hence may 
disclose more to deter litigation. Indeed, for the good (bad) news firms, the median size (market 
capitalization at the beginning of the examined quarter) of those disclosing type 1—3 messages 
is $418.1 million ($423.1 million), while that of nondisclosers (type 7—8) is $228.6 million 
($202.2 million). The difference in medians for the bad news firms is statistically significant at 
the 0.00 level (Wilcoxon test), while the difference for the good news firms is insignificant. 


¹² O'Brien and Hodges (1991) report that close to 30 percent of shareholder lawsuits during 1988-91 were filed against 
high tech companies, while these companies constitute only about 10 percent of the Industrial COMPUSTAT firms (and 
also about 10 percent of the firms in our sample). Also, the Treadway Commission on Fraudulent Financial Reporting 
(1987) suggests that a firm's membership in a high tech industry should be considered by auditors a warning signal for 
financial reporting problems. 

Volume of trade may also be associated with disclosure. Other things equal, the larger the 
volume of trade the larger the potential damage claim by shareholders alleging that they 
purchased the stock at an inflated price. Hence, firms with large share volume may disclose more 
to deter litigation. Also, financial analysts are known to vigorously demand information from 
managers. It can, therefore, be expected that the larger the number of analysts following the firm, 
the larger the amount of discretionary information disclosed by it. We consider these three factors 
(size, volume, and number of analysts) together since they are highly correlated, and accordingly, 
we included only one of them in each iteration of the following analysis. Finally, firm risk was 
found in previous studies to be associated with the extent of discretionary disclosure (e.g., Lang 
and Lundholm 1993). We therefore included risk, measured by the stock’s β value, as a control 
variable. 


Estimation results 


The association between firms’ disclosure policies and the above factors and attributes is 
examined by a multi-response logit model (Amemiya 1981), where the dependent variable 
(response) is the 3-way categorization of disclosure: earnings/sales forecasts (types 1-3), 
operating information (types 4—6), and no-disclosure (types 7—8). The independent variables are 
the factors and attributes outlined above. Table 5 presents estimates of a separate multi-response 
logit model for the good and the bad news firms. (We comment below on a combined good/bad 
analysis). 
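
To make the three-way response concrete, the sketch below computes category probabilities from a cumulative (ordered) logit. The two intercepts reported in table 5 suggest such a form, but the exact parameterization is not spelled out in the text, so the cumulative structure, the function names, and the numerical inputs in the usage note are all assumptions of this illustration.

```python
import math


def logistic(v):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def cumulative_logit_probs(intercepts, coefs, x):
    """Category probabilities under an assumed cumulative logit.

    intercepts: ascending cut points, one fewer than the number of categories.
    coefs, x: slope coefficients and attribute values, in the same order.
    Assumes P(category <= k) = logistic(intercepts[k] + x'coefs), so a positive
    coefficient shifts probability toward the first (most specific) category.
    """
    xb = sum(c * v for c, v in zip(coefs, x))
    cumulative = [logistic(a + xb) for a in intercepts] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs
```

With hypothetical cut points of -2.0 and -0.5 and a single attribute with coefficient 0.4, raising the attribute from 1.0 to 3.0 increases the probability of the first category (earnings/sales forecasts) and lowers that of the last (no disclosure).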

For the good news firms, firm size is the only attribute significantly associated with disclosure 
category (perhaps because of the relatively small number of good news firms—171). The 
coefficient sign is positive, as expected, indicating more frequent and specific disclosures as firm 
size increases. For the bad news firms, more attributes are statistically significant with the 
expected coefficient sign: the size of the earnings surprise (expectations gap), firm size, 
membership in a high tech industry, and being a regulated firm. Considering the earnings 
surprise variable, the data in table 4 (mean and median surprises) and the 3-way analysis in table 
5 suggest that managers choose the type of disclosure (direct earnings/sales forecast, operating 
information, or no-disclosure), at least in part, according to the severity of the expectations gap." 

To examine the significance of the "prior prospective statement" variable, a 2-way logit 
analysis was conducted, where the dependent variable took the value of 1 for firms disclosing 
types 1-3, and 0 for nondisclosers (types 7-8). The reason for this separate analysis is that firms 
whose primary disclosure was of type 4—6 were not searched for the existence of a previous 
prospective statement (as explained above). The previous prospective statement variable turned 
out to be marginally significant (with the expected positive coefficient sign): p-values of 2-tail 
t-test were 0.11 for good news firms and 0.09 for bad news firms. All other variables had similar 
values to those in table 5. 

Running the logit analysis on the combined sample of good and bad news firms, and adding 
to the independent variables a disclosure dummy (0 for good news and 1 for bad news firms), 
yielded practically identical results to those in table 5. Size of earnings surprise (p-value = 0.00), 
existence of a prior forecast (p-value = 0.03), firm size (p-value = 0.00), high tech firm (p-value 
= 0.03), regulatory status (p-value = 0.00), and the disclosure dummy (p-value = 0.05) were all 


¹³ The negative coefficient sign of the earnings surprise is due to the fact that all the surprises in this group are negative. 

¹⁴ When we ran the multi-response analysis with volume (average of log monthly number of shares traded, divided by 
shares outstanding), or with the number of analysts following (from IBES), substituting for firm size, each of these 
variables was statistically significant for both good and bad news firms, at the 0.05 level. 

TABLE 5 
Disclosure Attributes 

Multi-response logit analysis. Dependent variable takes the values of the three disclosure categories: 
earnings/sales forecasts (Types 1-3), operating information (Types 4-6), and no disclosure (Types 7-8). 
The independent variables are the attributes defined in the notes below the table. The values of the estimated response 
coefficients and their 2-tail p-values of asymptotic t-statistics (in parentheses) are presented in the table. 

Independent Variable      Good News Firms      Bad News Firms
Intercept 1               -4.889 (0.00)        -3.945 (0.00)
Intercept 2               -2.790 (0.00)        -2.235 (0.00)
Earnings Surprise          4.573 (0.36)        -3.194 (0.00)
Firm Size                  0.336 (0.00)         0.356 (0.00)
Risk (β)                   0.373 (0.48)         0.160 (0.63)
High Tech                  0.132 (0.83)         0.697 (0.03)
Regulation                -0.338 (0.39)        -0.795 (0.00)
No. of Observations        171                  394
Chi-Square                 11.82 (0.04)         49.84 (0.00)

Variable definition: 

Earnings surprise: Reported fourth quarter EPS (before extraordinary items) minus analysts’ forecast of EPS, deflated 
by stock price at the beginning of the fourth quarter. 
Firm Size: Log of total market value of the firm at the beginning of the fiscal quarter examined here. 
Risk: The stock's β, generally estimated over 60 months (at least 36) prior to the quarter examined. 
High tech: 1 when the sample firm belongs to: Drugs (COMPUSTAT SIC codes 2833-2836), R&D Services 
(8731-8734), Programming (7371-7379), Computers (3570-3577), Electronics (3600-3674); and 
0 otherwise. 
Regulation: 1 when the firm belongs to: Telephone (COMPUSTAT 4-Digit SIC codes 4812-4813), TV (4833), 
Cable (4841), Communications (4811-4899), Gas (4922-4924), Electricity (4931), Water (4941), 
Financial firms (6021-6023, 6035-6036, 6141, 6311, 6321, 6331); and 0 otherwise. 





found to have the expected sign and to be statistically significant. The positive coefficient of the 
disclosure dummy is consistent with table 3 findings that bad news firms disclose more than good 
news firms. 

When the multi-response analysis reported in table 5 was run on the sample firms excluding 
those with large special items (9 firms with good earnings news, 58 with bad news), the only 
change occurred for the “prior forecast" independent variable. While this variable became more 
statistically significant for the good news firms (p-value = 0.07), it lost its significance for the bad 
news firms. Apparently, many of the warnings “correcting” or "updating" previous forecasts were 
made by firms with large special items. When we replicated the multi-response analysis, 
measuring the earnings surprise by a seasonal random walk (fourth quarter EPS minus the 
preceding year’s fourth quarter EPS, divided by price), results were virtually identical to those 
reported in table 5. The only notable change was that the regulation variable became significant 
(p-value = 0.03) for the good news firms too. 
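
The two earnings surprise measures used in this section can be written out as a short sketch. The function and argument names are hypothetical; the formulas follow the variable definition under table 5 and the seasonal random walk description above.

```python
def analyst_based_surprise(reported_eps, forecast_eps, price_begin_q4):
    """Reported Q4 EPS (before extraordinary items) minus the analysts'
    forecast, deflated by the stock price at the beginning of the quarter."""
    return (reported_eps - forecast_eps) / price_begin_q4


def seasonal_random_walk_surprise(eps_q4, eps_q4_prior_year, price):
    """Q4 EPS minus the preceding year's Q4 EPS, divided by price."""
    return (eps_q4 - eps_q4_prior_year) / price
```

As an illustration with made-up numbers, reported EPS of 0.50 against a forecast of 0.80 at a price of 20 gives a surprise of about -0.015, i.e., a shortfall of 1.5 percent of price.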

Overall, for bad news firms the size of the earnings surprise, the existence of a prior 
prospective statement, membership in a high tech or a regulated industry, and firm size appear to 
be the most consistent disclosure attributes. For good news firms, only firm size and a previous 
forecast are associated with disclosure type. With respect to inferences concerning the earnings 
surprise variable, one should recall that our sample is nonrandom, being composed of firms 
experiencing relatively large surprises. 


V. INVESTOR REACTION TO WARNINGS 


Thus far, we examined pre-surprise releases primarily from the disclosers' perspective, that 
is, the types of messages released and firm attributes associated with the information communi- 
cated. We now consider the recipients’ reaction to these disclosures, particularly focusing on the 
following question: do investors react differently to a large earnings surprise which is commu- 
nicated gradually (i.e., preceded by warnings) than to a surprise that occurs without prior 
warning? In efficient and perfect capital markets, there should not be any difference in investor 
reaction to the two disclosure policies. However, given prevalent evidence on capital market 
anomalies with respect to earnings announcements (e.g., Bernard and Thomas 1989), an 
empirical examination of the reaction to warned and unwarned earnings surprises is called for.¹⁵ 
This comparison is conducted over the following three return windows, with a focus on the 
"combined warning + earnings window." 


(i) The long window: spanning from the 31st calendar day after the third quarter earnings 
announcement (see time line in Section II) through the second trading day after the fourth 
quarter earnings announcement (E,+2 days). This window encompasses all the discretionary 
disclosures made by the firm within the observation window as well as the fourth quarter 
earnings announcement. 

(ii) The combined warning + earnings window: includes five trading days around the release of 
the “primary disclosure” (i.e., the highest ranked discretionary disclosure of types 1 through 
6, see Section III), plus five trading days around the fourth quarter earnings announcement. 
This window reflects the combined reaction to the discretionary disclosure and to the 
earnings announcement. This combination is necessitated by the obvious interaction between 
the two disclosures (i.e., a warning preempts, to some extent, the reaction to the earnings 
announcement). Comparing the combined window returns of disclosing with those of non- 
disclosing firms poses a problem, since the latter (i.e., firms with disclosure types 7—8) do not 
have a “primary disclosure” return. Accordingly, in order to compare an equivalent 10-day 
return window for disclosing and non-disclosing firms, we added to the latters’ five-day 
earnings announcement return the average, firm-specific five-day return (drift) over the 
discretionary disclosure interval (see time-line in Section II).¹⁶ Consequently, the combined 
window for each sample firm, disclosing as well as non-disclosing, is of 10-trading day 
length. 

(iii) The earnings announcement window: five trading days (-2 through +2) around the fourth 
quarter earnings announcement (E,). 
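
The construction of a comparable 10-day combined window for disclosing and non-disclosing firms can be sketched as below. The text does not state how the average five-day drift is derived from the interval return, so the linear scaling used here, along with all function and argument names, is an assumption of the sketch.

```python
def five_day_drift(interval_return, interval_trading_days):
    """Average five-trading-day return over the disclosure interval.

    Assumes simple linear scaling of the cumulative interval return.
    """
    return interval_return * 5.0 / interval_trading_days


def combined_window_return(earnings_window_return, warning_window_return=None,
                           interval_return=None, interval_trading_days=None):
    """Ten-trading-day combined return for one firm.

    Disclosing firms: five-day return around the primary disclosure plus the
    five-day return around the earnings announcement.
    Non-disclosing firms: the firm-specific five-day drift replaces the
    missing warning window return.
    """
    if warning_window_return is not None:
        return warning_window_return + earnings_window_return
    drift = five_day_drift(interval_return, interval_trading_days)
    return drift + earnings_window_return
```

For example, a non-disclosing firm with a hypothetical -4 percent return over a 40-trading-day interval contributes a drift of -0.5 percent to its combined window.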


¹⁵ In private conversations we had with several CEOs, some expressed the belief that by alerting investors to a forthcoming 
surprise an overreaction is avoided. 

¹⁶ For each nondisclosing firm the return over the disclosure interval (from E,+31 to E,-2) was first computed, and then 
a five-day return was measured. The mean five-trading day drift for the non-disclosing/good news firms was 0.30%, 
and for the non-disclosing/bad news firms was -0.53%. 

The cumulative market-adjusted returns (the firm’s raw return minus the corresponding CRSP 
equally-weighted market return) were computed over the three windows for each sample firm. 
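
A minimal sketch of the market-adjusted return computation follows. It assumes that daily abnormal returns (firm minus market) are summed over the window; whether the original computation summed or compounded daily returns is not stated, and the function name is hypothetical.

```python
def cumulative_market_adjusted_return(firm_returns, market_returns):
    """Sum of daily firm returns minus the matching daily market returns.

    firm_returns, market_returns: equal-length sequences of daily returns over
    one window; the market series stands in for the CRSP equally-weighted index.
    """
    if len(firm_returns) != len(market_returns):
        raise ValueError("return series must cover the same trading days")
    return sum(f - m for f, m in zip(firm_returns, market_returns))
```

Applied to a five-day announcement window, the function takes the five daily firm returns and the five corresponding index returns and yields that firm's window return used in the regressions.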

To examine investors’ reaction to warnings, we cross-sectionally regressed market-adjusted 
returns on four independent variables: (i) an intercept warning dummy: 1 for firms making 
disclosure types 1-6, and 0 for non-disclosing firms (disclosure types 7-8), (ii) a slope warning 
dummy: the disclosure dummy in (i) times the fourth quarter earnings surprise, (iii) the fourth 
quarter earnings surprise, and (iv) firm size, measured as the log of total market value of the firm 
at the beginning of the fourth quarter. (The regression equation is presented in table 6.) The first 
two independent variables in the regression reflect the act of warning (expected to enhance 
returns), while the latter two variables, which are generally associated with stock price reaction 
to announcements, were incorporated in the regression to mitigate the missing variable problem 
(e.g., the warning variable may be statistically significant because it proxies for firm size). Given 
the three stock return windows specified above, three cross-sectional regressions were run for the 
good news and for the bad news firms. Regression estimates are presented in table 6. 

For the “long window” regressions (top row in each panel), the only statistically significant 
variable is firm size, for the good news firms. However, when we focus on investors' reaction to 
specific disclosures (return windows around announcements), results are more revealing. For the 
“combined window” regressions (i.e., reaction to the warning and to the subsequent earnings 
disclosure), the earnings surprise coefficient (b_1) is statistically significant for both the good and 
bad news firms (and size is again significant for the good news firms). Furthermore, the intercept 
disclosure dummy (coefficient b_2) has a negative and statistically significant coefficient (p-value 
is 0.04) for the bad news firms. Thus, the act of warning investors of an impending earnings 
disappointment is associated with a lower stock return. This counterintuitive finding is examined 
further below.¹⁷ 

The earnings announcement regressions (bottom row in each panel of table 6) yield 
insignificant coefficients for all variables in the good news firms' analysis, but for the bad news 
firms (panel B), both the earnings surprise, b_1, and the disclosure slope dummy, b_3, coefficients 
are highly significant. The opposite sign of these coefficients reflects the preemption effect of the 
warning (noted by Skinner 1994): while for the no-warning firms the earnings surprise coefficient 
is 0.172, for the warning firms it is -0.074 (the sum of the coefficients 0.172 and -0.246). 
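
The arithmetic behind the preemption effect follows from the interaction term: the sensitivity of returns to the earnings surprise equals the earnings surprise coefficient plus the slope dummy coefficient times the warning dummy. The short sketch below, with a hypothetical function name, reproduces the two slopes from the coefficients 0.172 and -0.246 quoted above.

```python
def earnings_response_slope(surprise_coef, slope_dummy_coef, warned):
    """Implied return sensitivity to the earnings surprise.

    warned: 1 for firms that issued a warning, 0 for no-warning firms.
    """
    return surprise_coef + slope_dummy_coef * warned


# Bad news firms, earnings announcement window (coefficients from the text).
slope_no_warning = earnings_response_slope(0.172, -0.246, 0)
slope_warning = earnings_response_slope(0.172, -0.246, 1)
```

The no-warning slope is the earnings surprise coefficient itself, while the warning slope is the sum of the two coefficients, roughly -0.074.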

To examine the robustness of these findings, we replicated table 6 regressions limiting the 
warnings to the “hardest” disclosure types 1—3. Results are essentially identical to those reported 
in table 6. We also replicated the regression analysis with the earnings surprise variable, UE_i, 
measured by a seasonal random walk, rather than against analysts' forecasts. Again, results are 
unchanged from those reported in table 6. Finally, we replicated the regression analysis on the 
subsample of firms excluding those with large special items. Once more, there was no significant 
change in the findings. 

Given the unexpected finding (table 6, panel B), that the release of a warning by bad news 
firms is associated with a lower combined return (around the warning and the earnings 
announcement), we reexamined this outcome using a matched sample design. Specifically, each 
disclosing firm was matched with a non-disclosing firm by two matching criteria: the extent of 
earnings surprise and firm size. Controlling for these variables allows one to focus on the 
incremental reaction to the warning since, in general, earnings surprise and size are strongly 


¹⁷ We consider this finding counterintuitive because a warning generally provides partial information about the subsequent 
earnings surprise, and is therefore expected to be rewarded by investors. 

TABLE 6 
Market Reaction to Warnings 

Coefficient estimates (and p-values of t-test in parentheses) of regressions of market-adjusted returns on 
disclosure-related variables: 

MARET_i = a + b_1*UE_i + b_2*DISC_i + b_3*(UE*DISC)_i + b_4*SIZE_i + ε_i 

Panel A: Good News Firms 

Long Window Market Adjusted Returns 
(from 60 days before earnings announcement through two days after it) 

Coefficient             a         b_1       b_2       b_3       b_4       adj. R²
Coefficient Estimate    0.115     0.779     -0.023    -0.700    -0.015    0.06
p-value*                (.03)     (.13)     (.51)     (.33)     (.07)

Combined Window Market Adjusted Returns 
(5 days around earnings announcement plus 5 days around the warning) 

Coefficient             a         b_1       b_2       b_3       b_4       adj. R²
Coefficient Estimate    0.051     0.412     0.012     -0.202    -0.006    0.04
p-value*                (.03)     (.07)     (.43)     (.52)     (.07)

Short Window Market Adjusted Returns 
(5 days around earnings announcement) 

Coefficient             a         b_1       b_2       b_3       b_4       adj. R²
Coefficient Estimate    0.017     0.219     -0.014    -0.121    -0.001    0.01
p-value*                (.39)     (.28)     (.30)     (.67)     (.77)

Panel B: Bad News Firms 

Long Window Market Adjusted Returns 
(from 60 days before earnings announcement through two days after it) 

Coefficient             a         b_1       b_2       b_3       b_4       adj. R²
Coefficient Estimate    -0.060    0.128     -0.013    0.006     0.001     0.01
p-value*                (.08)     (.21)     (.50)     (.96)     (.80)

Combined Window Market Adjusted Returns 
(5 days around earnings announcement plus 5 days around the warning) 

Coefficient             a         b_1       b_2       b_3       b_4       adj. R²
Coefficient Estimate    -0.031    0.157     -0.020    -0.060    0.003     0.06
p-value*                (.06)     (.00)     (.04)     (.33)     (.26)

Short Window Market Adjusted Returns 
(5 days around earnings announcement) 

Coefficient             a         b_1       b_2       b_3       b_4       adj. R²
Coefficient Estimate    0.001     0.172     -0.003    -0.246    -0.002    0.07
p-value*                (.97)     (.00)     (.66)     (.00)     (.40)

MARET_i = market-adjusted returns for each of the three return windows examined. 
UE_i = the earnings surprise scaled by share price. 
DISC_i = warning dummy: 1 for firms disclosing types 1-6, and 0 for firms with types 7-8. 
SIZE_i = log market value at the beginning of the fourth fiscal quarter. 

* Of two-tail t-test 

associated with investors’ reaction to announcements.¹⁸ Furthermore, controlling for the extent 
of the earnings surprise reduces the likelihood that a nonlinear returns-earnings relation is driving 
the results. In contrast with the earlier regression analysis, the matched sample design also 
provides a direct estimate of the differential returns to disclosing and non-disclosing firms. 
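
One way to sketch the matched sample design is a greedy nearest-neighbor match on the two criteria. The matching algorithm is not described in the text, so the range-scaled distance, matching without replacement, and all names below are assumptions of this illustration rather than the procedure actually used.

```python
def match_firms(disclosers, non_disclosers):
    """Pair each disclosing firm with its closest non-disclosing firm.

    Each firm is a tuple (firm_id, earnings_surprise, log_size). Distance is
    the squared difference in each criterion after scaling by its sample
    range; each non-disclosing firm is used at most once.
    """
    def spread(values):
        return (max(values) - min(values)) or 1.0

    everyone = list(disclosers) + list(non_disclosers)
    ue_scale = spread([f[1] for f in everyone])
    size_scale = spread([f[2] for f in everyone])

    available = list(non_disclosers)
    pairs = []
    for firm in disclosers:
        if not available:
            break
        best = min(
            available,
            key=lambda c: ((firm[1] - c[1]) / ue_scale) ** 2
            + ((firm[2] - c[2]) / size_scale) ** 2,
        )
        available.remove(best)
        pairs.append((firm[0], best[0]))
    return pairs
```

The differential return to warning is then the mean (or median) difference in window returns across the matched pairs.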

Table 7 presents mean and median market-adjusted returns of the bad news firms that warned 
investors prior to the earnings announcement and the matched firms that did not issue a warning.¹⁹ 
Note that the size of the earnings surprise of the two groups is almost identically matched at both 
the mean and median, while firm size is closely matched at the median, but less so at the mean 
(warning firms are somewhat larger than no warning firms). While the return differences between 
warning and no warning firms over the “long” and the “earnings announcement” windows are 
statistically insignificant (largely consistent with the regression results in table 6), the differences 
for the “combined window” (warning + earnings announcement) returns, reported in the second 
column from the right in table 7, are significant. The direction of difference is consistent with that 
of the regression analysis: for example, for warning firms defined as those releasing the “hardest” 
disclosures (types 1—3, panel A), the combined mean (median) market-adjusted return is -6.2 
(-7.0)%, while the mean (median) return of the matched no warning firms is only -2.0 (-2.4)%. 
The differences in both the means and medians are statistically significant (p-values: 0.01 and 
0.08). When disclosing firms are defined more broadly as those releasing disclosure types 1-6 
(panel B), the return differences are in the same direction (lower for warning firms) but somewhat 
smaller in magnitude: -4.2 vs. -2.0% at the mean. While the mean difference is statistically 
significant, the median difference is not. 

The market reaction analysis was mainly aimed at addressing whether investors react 
differently to earnings surprises preceded by warnings than to unwarned surprises. Both the 
regression and the matched sample analysis suggest a different reaction for bad news firms— 
more negative for warning than no warning firms. The following section examines possible 
explanations for this finding. 


VI. EXPLAINING INVESTOR REACTION


It is possible that the larger negative return to warning firms is related to "special items"
announcements, given that a relatively large number of bad news firms issued such warnings (e.g.,
of asset writeoffs and restructuring charges). However, this is not the case: our analysis
shows that for the subsample of firms excluding those with large special items, the “combined
window” return of the warning firms (warning types 1-3) is -5.7% vs. only -1.6% for the no
warning firms (the difference is significant at a p-value of 0.01, two-tail t-test). Another possible
explanation for the apparent negative reaction to warnings is that our nondisclosing firms released
information which was not recorded by our information search (i.e., NEXIS). For example, it is
possible that some of our “no warning firms” communicated negative pre-earnings information 
via conference calls to financial analysts. While we cannot rule out such a possibility, it does not 
appear to explain our findings. Specifically, if the no warning firms released information during 
the examined interval which was not captured by our search, then their long-window returns (from 
-60 to +2 days around the earnings announcement) should have been close to the combined 
window return of the warning firms. Again, this is not the case: the mean long window return of 


18 See, for example, Foster et al. (1984) for evidence on the association between investors' reaction to earnings surprise 
and firm size. 

19 For the good news firms the matched-pair analysis did not yield statistically significant results, and therefore the results 
are not presented in table 7. The only significant difference is for the five-day return reaction around the earnings
announcement: 0.005 vs. 0.020, for disclosing and non-disclosing firms, respectively. This difference is apparently due 
to the preemption effect of the voluntary disclosure. 
TABLE 7

Returns to Warning and Matched No-Warning Firms

Mean and median market-adjusted returns over three windows for warning and no-warning firms,
matched by earnings surprise and firm size.


Panel A: Bad News Firms, Disclosure = Types 1-3

                                                    Market-Adjusted Returns
                                       Earnings     Long       Combined   5-day
                                 N     Surprise     Window     Return     Earnings
Means
  Warning Firms                  64    -0.080       -0.053     -0.062     -0.008
  No Warning Firms               64    -0.079       -0.033     -0.020     -0.019
  p-value: t-test¹                      .95          .53        .01        .36
  p-value: Wilcoxon Rank-Sum²                        .43        .00        .78
Medians
  Warning Firms                  64                 -0.086     -0.070     -0.021
  No Warning Firms               64                 -0.059     -0.024     -0.021
  p-value: Difference in Medians                     .48        .08        .99


Panel B: Bad News Firms, Disclosure = Types 1-6

                                                    Market-Adjusted Returns
                                                    Long       Combined   5-day
                                 N                  Window     Return     Earnings
Means
  Warning Firms                  110                -0.056     -0.042     -0.008
  No Warning Firms               110                -0.030     -0.020     -0.019
  p-value: t-test¹                                   .23        .04        .17
  p-value: Wilcoxon Rank-Sum²                        .33        .05        .43
Medians
  Warning Firms                  110                -0.064     -0.034     -0.015
  No Warning Firms               110                -0.059     -0.024     -0.022
  p-value: Difference in Medians                     .79        .28        .42


1 Two sample t-test (two tail) of difference in means between warning and no warning firms.

2 Wilcoxon's rank-sum test for identical populations.
the no warning firms (table 7, panel A) is only -3.3% vs. -6.2% mean “combined window” return
for the warning firms (p-value of difference is .15, by a two-tail t-test). 

What then explains the more pronounced negative investor reaction to warnings, relative to 
no warning? It seems plausible that managers will issue a warning when they perceive the 
forthcoming earnings disappointment to be permanent, while transitory disappointments may go 
largely unwarned. Examining this explanation requires a measure of earnings permanence other 
than the market reaction (which is being explained here). We chose the revision in analysts’ 
forecast of next year’s annual earnings around the fourth quarter’s disclosures as the measure of 
permanence of the earnings surprise. Specifically, we collected for the sample firms in panel A
of table 7 the Value Line forecasts of fiscal year t+1 earnings made after the fiscal year t third quarter
earnings announcement (but before the primary disclosure), and the revised fiscal year t+1
forecasts made after fiscal year t fourth quarter earnings were released.²⁰ Thus, as a proxy for
earnings permanence, we record the revision in the forecast of t+1 earnings around the
discretionary releases (warnings) as well as the fourth quarter earnings announcement. The
revision is measured for each firm as the later forecast minus the earlier forecast, divided by the 
stock price at the end of the third quarter of fiscal year t. 
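
The revision measure just described can be sketched in a few lines. This is only an illustration of the stated definition (later forecast minus earlier forecast, scaled by the third-quarter-end stock price); the function name and the numbers for the example firm are hypothetical and are not taken from the sample.

```python
def forecast_revision(earlier_forecast, later_forecast, q3_price):
    """Price-scaled revision in the forecast of fiscal year t+1 earnings.

    earlier_forecast: forecast made after the third quarter announcement
    later_forecast:   forecast made after the fourth quarter announcement
    q3_price:         stock price at the end of the third quarter of year t
    """
    if q3_price <= 0:
        raise ValueError("stock price must be positive")
    return (later_forecast - earlier_forecast) / q3_price


# Hypothetical firm: the t+1 EPS forecast is cut from 2.50 to 1.90
# while the stock traded at 20.00 at the end of the third quarter.
example = forecast_revision(earlier_forecast=2.50, later_forecast=1.90, q3_price=20.00)
print(round(example, 3))  # -0.03
```

A more negative value indicates a larger downward revision, which the text interprets as a more permanent earnings disappointment.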

If, as we conjectured, warnings tend to be issued for permanent earnings disappointments 
(and if the degree of permanence is reflected by the extent of forecast revisions), then the revision 
of the earnings forecasts of warning firms will be larger than the forecast revision of the no 
warning firms. This indeed is the case, as evidenced by the following data:


                               Forecast Revision
                               Mean       Median
Warning firms (N = 54)²¹       -0.067     -0.033
No warning firms (N = 54)²¹    -0.027     -0.012


Both the mean and median forecast revisions of the warning firms (-0.067 and -0.033) are more 
negative than the mean and median revisions of the no warning firms (-0.027 and -0.012). The 
differences are statistically significant (by the two-sample t-test for the means at a p-value of 0.07,
and by the two-sample Wilcoxon test at a p-value of 0.03). The evidence is thus supportive of the 
conjecture that the more negative investor reaction to warnings is related to the permanence of 
the earnings disappointment preceded by warnings. 

However, the above conclusion does not preclude other conjectures for the more pronounced 
negative reaction to warnings. One possibility is that investors read into the warning more than 
managers intend. While a warning generally relates to the forthcoming earnings announcement, 
it may raise concerns about firm competitiveness and long-term viability.²² A warning of a


20 We prefer Value Line over IBES for the measurement of the revision in fiscal t+1 forecasts of earnings, since with IBES
data the individual analysts providing the forecasts often change over time. Thus, the two forecasts of t+1 may be
provided in part by different persons, thereby introducing additional noise into the forecast revision measure (i.e.,
reflecting the different idiosyncrasies of each forecaster).

21 In panel A of table 7 there are 64 matched pairs of firms. We could not find Value Line forecasts for 10 pairs, hence
data on forecast revisions are based on 54 pairs.

22 A case in point is Apple Computer's June 9, 1993 warning that "second half results could fall short of expectations and
year-earlier profits." The 10.6% drop in Apple's stock price on that day (bringing down Compaq's stock as well) was
in part due to analysts' concerns with Apple's long-term competitiveness, obviously extending beyond the 1993 second
half earnings. This is made clear by analysts' pronouncements on the day of the warning (reported by Reuters, and derived
by us from NEXIS, June 9, 1993). For example: "Apple Computers Inc. loses too many battles in the computer price
war.... The Macintosh is not as distinctive as it was, they are going to have to cut prices much more aggressively....
The price war is not going to go away any time soon because there are still too many competitors offering products at
cut-rate prices."
forthcoming disappointment and elaboration on its causes may lead investors to a comprehensive
reassessment of the firm's competitive position, and perhaps even to an overreaction to the bad
news. Recall the intriguing question raised earlier: if discretionary disclosure, and warning of
earnings disappointments in particular, are as beneficial as often claimed in the literature (e.g.,
deterring shareholder litigation; Skinner 1994), why do not all firms warn investors before
releasing disappointing earnings?²³ Managers’ concern with an overreaction to warnings may
provide a clue to this question. 


VII. CONCLUDING REMARKS


In this study, we examined the disclosure policies of managers facing large earnings sur- 
prises, by observing all corporate communications made public over a 60-day interval preceding 
the earnings announcement. Roughly half the sample firms did not provide any operating 
information prior to surprising investors, while less than ten percent provided quantitative point 
or range estimates of earnings or sales. The remaining firms provided various types of operating 
information prior to surprising investors. Bad news firms (whose earnings were below analysts’ 
forecasts) released significantly more discretionary disclosures than good news firms. 

The type of message communicated appears suited to the extent of the forthcoming surprise 
(closure of the expectations gap): the larger the surprise (particularly so for disappointments), the 
more quantitative and earnings-related the disclosure. Additional attributes positively associated 
with the likelihood of issuing a warning by bad news firms are firm size, the existence of a previous
forecast and operating in a high tech industry. For good news firms, only firm size and the 
existence of a previous forecast are associated with likelihood of alerting investors to the 
forthcoming surprise. Managers appear less concerned with narrowing positive expectation gaps 
than with avoiding large earnings disappointments. This asymmetric disclosure raises interesting 
issues for further study. For example, the relative dearth of disclosure concerning positive 
earnings surprises adversely affects shareholders selling prior to the earnings announcement. Do 
managers have a duty to inform such shareholders of the forthcoming stock price increase upon 
announcement of the earnings surprise? 

When the combined investor reaction to the warning and the subsequent earnings announcement
is compared with the corresponding reaction for a matched sample of no warning firms, the
former is significantly more negative than the latter. An examination of forecast revisions around 
the warnings and the subsequent earnings announcement indicates that warnings tend to be issued 
when the earnings disappointment is more permanent, consistent with the more negative returns 
observed for warning firms relative to “silent” ones. The tendency to issue warnings for permanent
disappointments and leave transitory disappointments largely unwarned provides one
explanation for the large number of firms (about half our sample) that kept silent before surprising
investors. However, additional explanations for the observed differential investor reaction cannot
be ruled out, such as investors attaching to a warning long-term consequences and perhaps even 
overreacting to it. The related issues of a possible overreaction to warnings and the large number 
of firms which keep silent before surprising investors deserve further examination. 


23 Our findings (table 3) indicate that roughly half the large-surprise firms did not disclose any operating information prior
to the earnings announcement. 
I. INTRODUCTION


Prior studies suggest that firms can lower their cost of capital by increasing their disclosure
of credible information.¹ Combining this prediction with the additional assumption that the
benefits of a lower cost of capital are greater for firms that use external financing more frequently,
we would expect firms that finance more frequently to forecast their earnings more frequently.
A competing force, however, is the threat of litigation, which could increase the cost of a
management forecast, particularly a favorable one shortly before an offering. Generally, for suits
concerning forecast accuracy, the plaintiff's attorney must show reliance on a false or misleading
statement, that this reliance resulted in damages to the shareholder, and that management acted
with scienter.² However, companies that make public offerings are subject to a second layer of
regulation that reduces the plaintiff's burden of proof and thus increases the company's expected
litigation costs shortly before an offering.³ By examining management forecast behavior and its
relation to external financing, this study provides evidence on the relative strengths of these two
competing forces.⁴

Specifically, this paper examines the association between firms' external financing decisions 
and their tendencies to disclose numeric or qualitative earnings forecasts. First, we consider 
whether earnings forecasts are related to firms' long-run tendencies to access capital markets. 
Second, turning to the short-run, we examine whether earnings forecasts are related to particular 
financing events. Third, we examine the character of earnings forecasts, such as their bias, shortly 
before external financing transactions. This research extends the discretionary disclosure 
literature, which examines firms' motivation to disclose information and the characteristics of 
information voluntarily disclosed.⁵ It also extends the literature on securities offerings, by
presenting a richer description of management's attempts to influence the information available 
at the time of an offering. 

The primary finding of our study is that the tendencies to issue management forecasts and 
finance externally are positively associated over long periods of time. However, conditional on 
an offering, we find that firms are not more likely to forecast in the period immediately prior to 
offerings than at other times. In addition, forecasts made before offerings are unbiased predictions 
of subsequent earnings. Disclosure regulation is often motivated by the argument that, due to free-rider
problems between investors or management's unwillingness to reveal unfavorable information,
too little information will be available to investors. Our research suggests that managers


1 See Cragg and Malkiel (1982), Verrecchia (1983), Diamond (1985; 1989), Merton (1987), Fishman and Hagerty (1989),
King et al. (1990), and Healy and Palepu (1993). 

2 See Bloomenthal et al. (1985) or Walton and Brissman (1991) for a discussion of rule 10b-5 of the Securities and 
Exchange Act of 1934, which has generally been the legal basis under which suits for damages are brought when 
management forecasts turn out to be inaccurate. 

3 Companies that make public offerings are subject to regulation under the Securities Act of 1933, and rule 10b-6 of the
1934 Act, as well as rule 10b-5. Under these regulations, the plaintiff's attorney need not prove scienter or reliance, and
(most typically) the stock price decline from the date of the offering to the date of suit is taken as the initial measure of
damages. The burden is on the defense to show that other factors contributed to the price decline. 

4 We consider managers’ public forecasts only, so we do not control for the possibility that managers leak their private 
information through intermediaries such as equity analysts or investment bankers shortly before an offering. However, 
if these leaks were as credible as forecasts, managers would never risk litigation by issuing public forecasts. Thus, our 
tests should be viewed as examining whether the incremental benefits of offering a more credible public forecast exceed 
the litigation costs. 

5 Related management forecast studies include Patell (1976), Penman (1980), Waymire (1984), Pownall and Waymire 
(1989), McNichols (1989), and Pownall et al. (1992). 

6 See Beaver (1977) for further discussion of this issue.
believe more disclosure enhances firm value, in that they are more likely to disclose an earnings
forecast if they regularly access capital markets. In this respect, market forces appear to provide 
incentives for more disclosure. Our research also suggests that management’s forecasts prior to 
public offerings are unbiased, so existing anti-fraud statutes and/or reputation costs may be 
sufficient to deter optimistic forecasts. 

This research is most closely related to Ruland et al. (1990), which documents that firms 
issuing management earnings forecasts are more likely to finance externally in the subsequent 
three months than firms in a comparison sample which issued no earnings forecasts.⁷ This finding
does not clarify whether firms forecast to influence the information investors have at the time of
a specific offering or forecast as part of a longer-term disclosure policy. Our results suggest the
latter. Financing firms forecast more frequently than non-financing firms, but they are not more
likely to forecast shortly before financing than at other times.⁸

The layout of the paper is as follows. Section II describes the data. Sections III-V describe 
the tests and results related to our three research questions, and section VI presents our 
conclusions. 


II. DATA 


As shown in table 1, the sample firms were selected from the 2,394 firms on the 1983
COMPUSTAT Annual Industrials File by imposing three criteria:⁹ first, to coincide with the data
available on management forecasts, the company must be traded on the American or New York
Stock Exchange, eliminating 213 firms; second, 22 entries corresponding to companies with 
projected financial statement data were eliminated because they were not separate entities for the 
period under study and thus were not included in the management forecast data base; and third, 
an additional 279 companies with missing asset data from 1980 to 1983 were eliminated because 
subsequent tests require asset information, leaving 1,880 firms. 
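
The sample selection arithmetic above can be checked directly. The short sketch below only restates the counts given in the text; the variable names and dictionary labels are illustrative.

```python
# Firms on the 1983 COMPUSTAT Annual Industrials File, as stated in the text.
initial_firms = 2394

# Firms removed by each of the three selection criteria.
exclusions = {
    "not traded on the American or New York Stock Exchange": 213,
    "projected financial statement entries": 22,
    "missing asset data, 1980 to 1983": 279,
}

final_sample = initial_firms - sum(exclusions.values())
print(final_sample)  # 1880
```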

Data on the external financing transactions of these firms, from January 1, 1980 to December
31, 1984, were acquired from Securities Data Corporation.¹⁰ The distribution of offerings over
the sample period appears in panel B of table 1.¹¹ Eighty-five percent of the offerings are common
stock or non-convertible bonds, and the remainder are mainly preferred stock or convertible 
bonds. Considerable variation exists over time in the number of offerings per year. For example, 
sample firms had twice as many common stock offerings in 1983 as in 1980. 

Panel C displays the frequency of financing transactions per firm over the 1980-83 period.
Approximately half of the sample firms made no offerings during the sample period. The
remaining firms financed externally 3,082 times during the four year period with a mean (median)
of 3.3 (2) financing transactions. The fact that many firms that finance do so frequently suggests 


7 Gibbins et al. (1990), based on interviews with 11 firms and nine other organizations, also report that frequency of use
of financial markets influences disclosure policy.

8 Our analysis also differs from Ruland et al. in that we discuss legal liability issues, analyze the tendency to issue
favorable forecasts around offerings, and study a broader sample of firms (1,840 vs. 292) and a larger set of capital
offerings (3,319 vs. 26).

9 We examine the 1980-83 period because management forecasts, which are costly to collect, were available for this
period from another study.

10 Most of our analysis focuses on financing data for the 1980-83 period, but, as we will describe later, we use data on
1984 financings in one of our tests and include them here for completeness.

11 The number of primary, secondary, and combined offers is not available for 62 percent of the transactions. However,
for the available data, 63 percent are primary issues, 24 percent are secondary issues (large block sales by existing equity
holders), and 13 percent are combined issues.
TABLE 1

Sample Information


Panel A: Sample Selection

Firms on the 1983 Compustat Annual Industrials File               2,394
Less Firms not on the American or New York Stock Exchange           213
Less Firms with Projected Financial Statements                       22
Less Firms Missing Data on Assets                                   279
Number of Firms in the Sample                                     1,880


Panel B: Number of External Financing Transactions by Year and Type 


Year 


1980 
1981 
1982 
1983 
1984 


Total 


Common 
Stock 


Bonds 


177 
321 


Other 


Panel C: Frequency of External Financing Transactions over the Sample Period

Number of Firms     Number of Financing Transactions
     946            0
     412            1
     204            2
     110            3
      59            4
     110            5-9
      39            10 or more
    1880            Total
Panel D: Number of Management Forecasts by Year and Type

Year      Numeric     Qualitative     Total
1980        172          394           566
1981        134          379           513
1982        204          389           593
1983        159          395           554
Total       669         1557          2226


Panel E: Frequency of Management Forecasts over the Sample Period

Number of Firms     Number of Forecasts
    1040            0
     336            1
     188            2
     116            3
      68            4
      48            5
      84            more than 5 forecasts
    1880


This table shows the sample selection criteria imposed, and the distribution of financing transactions and management
forecasts by type and year. External financing transactions are classified as “Common Stock,” “Bonds” or “Other,” where 
“Other” includes primarily preferred stock and convertible bonds. Management forecasts are classified as numeric or 
qualitative, depending on whether sufficient information was disclosed to derive a point forecast 
of earnings. 


that these firms anticipate their demand for external financing, a necessary condition for this
demand to have an effect on their long-term disclosure policy.¹²

Management forecast data were obtained from the Dow Jones Retrieval Service, which accesses
articles from The Wall Street Journal, Barron's, and unpublished announcements appearing on the
Broad Tape. Articles were searched for keywords common to management earnings forecasts, such
as “expects” and “earnings.” Flagged articles were then searched for a forecast meeting the following
criteria: first, it must be made on or before the last day of the fiscal year, to distinguish forecasts from
preliminary earnings releases; second, it must be attributed to a company official, to exclude forecasts
made by analysts or others not speaking for the firm; third, the firm's shares must be traded
on the American or New York Stock Exchange, to ensure availability of stock price data; and fourth,
numeric forecasts are included if they are the first forecast released by the firm for the fiscal year,
to exclude subsequent statements of the same forecast.¹⁴


12 The Spearman rank correlation of the number of offerings per firm with the cumulative offering proceeds scaled by total
assets is 0.88, suggesting substantial correspondence between financing frequency and the total amount offered through
time.

13 The numeric forecasts were examined in McNichols (1989). The qualitative forecasts refer to the forthcoming annual
earnings numbers.

14 This may result in some error in our proxy for the number of forecasts issued during the sample period. For this reason,
our published results are based on whether a single forecast was issued during the event period. Results based on the
number of forecasts issued are comparable to the published results.
For the numeric forecasts, analysts’ forecasts immediately prior to management’s forecast
were collected from the 1985 Institutional Brokers Estimate System data base and from Standard
and Poor's Earnings Forecaster.¹⁵ Earnings data were collected from the 1984 COMPUSTAT
Industrials File, Moody’s Industrials, and The Wall Street Journal. 

The distribution of management forecasts over the sample period appears in panel D of table 
1. The sample contains 669 numeric and 1,557 qualitative forecasts of earnings. While the number 
of firms issuing common stock nearly doubled over the sample period, the number of firms issuing 
numeric forecasts remained almost constant. This suggests that factors other than external 
financing behavior influence firms’ forecasting behavior within individual years. The frequency 
of management forecasts per firm over the sample period is in panel E. More than half the sample 
issued no forecasts. Of those that did forecast, the mean (median) number of forecasts was 2.65 
(2) over the 4 year period. These data suggest considerable variation across firms in their tendency 
to issue forecasts. 
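
The per-firm figures quoted above follow from panels D and E of table 1, as the short check below illustrates. The counts are those reported in the table; coding the open-ended "more than 5 forecasts" group as 6 is an assumption made only so that a median can be computed, and it does not affect the result because the median falls in the two-forecast group.

```python
from statistics import median

total_forecasts = 669 + 1557  # numeric + qualitative forecasts (panel D)

# Panel E: number of forecasts -> number of firms; 6 stands in for "more than 5".
firms_by_forecast_count = {0: 1040, 1: 336, 2: 188, 3: 116, 4: 68, 5: 48, 6: 84}

forecasting_firms = sum(n for k, n in firms_by_forecast_count.items() if k > 0)
mean_forecasts = total_forecasts / forecasting_firms

# Expand the frequency table into one entry per forecasting firm.
per_firm = [k for k, n in firms_by_forecast_count.items() if k > 0 for _ in range(n)]
median_forecasts = median(per_firm)

share_without_forecast = firms_by_forecast_count[0] / sum(firms_by_forecast_count.values())

print(forecasting_firms, round(mean_forecasts, 2), median_forecasts)
print(round(share_without_forecast, 3))
```

The mean is computed from the exact total in panel D rather than from the coded counts, so the placeholder value for the open-ended group plays no role in it.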


III. FORECASTING BEHAVIOR AND THE LONG-RUN
TENDENCY TO FINANCE EXTERNALLY 


This section examines whether forecasting behavior is related to the long-run tendency to 
access capital markets. In contrast to periods shortly before an offering, we assume that the 
litigation costs of increased disclosure are similar over the long run for firms that do and do not 
use external financing. Firms that plan to access capital markets are assumed to issue earnings 
forecasts to lower their cost of capital, but to time these forecasts to avoid periods when litigation 
costs are higher. Thus, because the benefits from forecasting for financing firms exceed those of 
non-financing firms, but the costs are similar, we hypothesize the following positive association 
(in alternative form): 


H1: The probability of a management forecast over the sample period (1980-1983) is greater
for firms that finance externally during this period than for firms that do not.


The association between forecasting and external financing could be due to firm size or 
industry, rather than our hypothesized behavior. The Wall Street Journal tends to follow larger 
firms that are more likely to finance externally,¹⁶ and the cost of issuing forecasts may be greater
for smaller firms. Regarding confounding industry effects, prior research suggests that differences
between utilities and non-utilities in financing and disclosure policies could be particularly
problematic. Patell (1976) found that utilities issue more forecasts than non-utilities. For our
sample, 76 percent of the utilities financed externally during the 1980-1983 period while only 46
percent of the non-utilities did so, and the utilities generally financed externally more frequently.¹⁷
To control for industry differences and firm size, H1 is tested by estimating the following probit
model on utility and non-utility subsamples:¹⁸


$$\mathrm{pr}(\mathrm{FORECAST}) = \alpha_0 + \alpha_1\,\mathrm{LASSETS} + \alpha_2\,\mathrm{EXTFIN} + \varepsilon \qquad (1)$$


15 Standard and Poor's Earnings Forecaster forecasts were used when IBES forecasts were unavailable.

16 See Atiase (1985), Cox (1985), Freeman (1987), Shores (1990) and O'Brien and Bhushan (1990).

17 Also, the nature of the information asymmetry associated with external financing by utilities may differ from other firms
because they are closely regulated and finance externally more frequently. Masulis and Korwar's (1986) study of equity
offerings is consistent with this prediction. They report average announcement returns for industrial firms and public
utilities of -3.25% and -0.66%, respectively.

18 The results for the full-sample estimation lead to the same conclusions.
where:
FORECAST = 1 if the firm issued a forecast over the sample period, and 0 otherwise;
LASSETS  = log of average year-end assets for 1980-1983;
EXTFIN   = 1 if the firm engaged in external financing from 1980-1983, and 0 otherwise; and
ε        = a normally distributed error term.


For the dependent variable in this regression, we classify forecasts after an offering the same
as those before an offering.¹⁹ We feel this classification is a reasonable one for examining long-run
disclosure behavior because many of the sample firms that finance have multiple financing
transactions, so a forecast after one offering also precedes the next offering.²⁰

The probit results, reported in table 2, confirm a significant association between the
probability of a forecast and the log of total assets, with a t-statistic of 10.59 (1.91) for non-utilities
(utilities). They also illustrate that the probability of a forecast is 28 percent higher (t=4.25) for
non-utilities, all else equal, if external financing also occurs over the sample period. This same
figure for utilities is a statistically insignificant increase of 11 percent (t=0.453).²¹ In addition to
controlling for size and industry, we also estimated (1) including a measure of growth, and found
that the results are robust to this control.²² The results from these regressions are consistent with
a positive long-term relation between the tendency to make earnings forecasts and the tendency
to use external financing.
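
As a rough illustration of how the estimates in table 2 can be read, the sketch below evaluates the fitted index of model (1) through the standard normal distribution function, which is the usual way a probit model maps its index to a probability. The coefficients are those reported in table 2; the function name and the LASSETS value of 6.0 are hypothetical, chosen only to show the mechanics, and the sketch is not a re-estimation of the model.

```python
from statistics import NormalDist

# Coefficients reported in table 2: (intercept, LASSETS, EXTFIN).
COEFFICIENTS = {
    "non-utilities": (-1.319, 0.192, 0.276),
    "utilities": (-1.502, 0.148, 0.111),
}


def forecast_probability(sample, lassets, extfin):
    """Fitted probability of a forecast: standard normal CDF of the probit index."""
    a0, a1, a2 = COEFFICIENTS[sample]
    index = a0 + a1 * lassets + a2 * extfin
    return NormalDist().cdf(index)


LASSETS_EXAMPLE = 6.0  # hypothetical log of average assets, for illustration only

p_without = forecast_probability("non-utilities", LASSETS_EXAMPLE, extfin=0)
p_with = forecast_probability("non-utilities", LASSETS_EXAMPLE, extfin=1)
print(round(p_without, 3), round(p_with, 3), round(p_with - p_without, 3))
```

Because the normal distribution function is nonlinear, the change in fitted probability associated with EXTFIN depends on the level of LASSETS at which it is evaluated.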


IV. FORECASTING BEHAVIOR SHORTLY BEFORE 
FINANCING EXTERNALLY 


We next examine the timing of forecasts by comparing a firm's forecasting behavior shortly 
before external financing to other non-event periods. In contrast to our long-run hypothesis, H1,
we are unable to sign the net benefit of additional disclosure for periods shortly before offerings.
The litigation and market price effects of additional disclosure act as countervailing forces during
these periods, albeit differently for firms with favorable and unfavorable news. For good-news
(bad-news) firms, the additional litigation risk shortly before an external financing transaction
decreases (increases) the incentive to disclose additional information.²³ The effect on price is just
the opposite; good-news (bad-news) firms have an incentive to disclose additional (less)
information to increase (avoid decreasing) the offering proceeds. Because we are unable to sign
the net benefit of additional disclosure, our second hypothesis is non-directional:²⁴


19 Of the 493 firms that financed and forecasted, 394 had at least one forecast prior to an offering, as compared to 99 firms
with forecasts only after an offering. Because our forecast data cover a fixed period, forecasts before an offering in 1980
are less likely than forecasts before an offering in 1983.

20 The presumption that firms that finance externally expect to do so again seems plausible. We find that 48.5% of the firms
that financed at least once in the 1980-83 period also financed at least once during 1984-85 and they were approximately
three times more likely to finance externally in 1984-85 than firms that did not finance previously. This association is
significant at the .0001 level (χ²=200.8, one degree of freedom).

21 We also estimated a polytomous probit model to assess the sensitivity of our results to the 0-1 specification of the
dependent variable. The results are qualitatively the same as those reported.

22 Growth was measured as the percentage change in total assets over the 1980-83 period.

23 Skinner (1992) finds that firms are more likely to release earnings forecasts and related disclosures prior to extreme
negative quarterly earnings changes and interprets this as evidence that firms increase disclosure of unfavorable
information to avoid lawsuits by investors.

24 Our argument for two-tailed short-run tests would also seem to apply to our long-run tests. Implicitly, we are making
three assumptions that differentiate the long and short-term disclosure environments. First, we assume litigation

(continued on page 142) 
TABLE 2
Probit Estimation of the Relation between Forecasting and External
Financing over the Sample Period

$\Pr(\text{FORECAST}) = \alpha_0 + \alpha_1\,\text{LASSETS} + \alpha_2\,\text{EXTFIN} + \varepsilon$    (1)

                                                                              Likelihood
Sample                        $\alpha_0$   $\alpha_1$   $\alpha_2$   Maddala $R^2$   Ratio $\chi^2$      n
Non-utilities  coefficient     -1.319       0.192        0.276         0.097           174.5       1717
               t-statistic   (-12.857)    (10.586)      (4.252)
Utilities      coefficient     -1.502       0.148        0.111         0.026             4.2        163
               t-statistic    (-2.639)     (1.905)      (0.453)


The above table shows the estimation results of the probit model (1) for the non-utilities and utilities samples. The variable 
definitions are: 


FORECAST = 1 if the firm issued a forecast over the sample period,
           and 0 otherwise,
LASSETS  = log of average year-end assets for 1980-1983,
EXTFIN   = 1 if the firm engaged in external financing from 1980-1983,
           and 0 otherwise,
$\varepsilon$ = a normally distributed error term.


The $R^2$ shown above is the Maddala $R^2$, calculated as $1-(L_\omega/L_\Omega)^{2/n}$, where $L_\Omega$ is the likelihood function maximized with
respect to all the parameters and $L_\omega$ is the function maximized only with respect to $\alpha_0$ (Maddala 1983, p. 39). The
Likelihood Ratio $\chi^2$ statistics are distributed $\chi^2$ with 2 degrees of freedom, and have a probability value of less than 0.0001
for the non-utilities sample and 0.30 for the utilities sample.
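
As a rough consistency check on the table, the Maddala R² can be recovered from the likelihood ratio statistic alone. The sketch below assumes the standard definition R² = 1 - (L_ω/L_Ω)^(2/n) and the identity LR = -2 ln(L_ω/L_Ω), so that R² = 1 - exp(-LR/n). The function name `maddala_r2` is illustrative and not part of the original study.

```python
import math

def maddala_r2(lr_chi2: float, n: int) -> float:
    """Maddala R^2 = 1 - (L_omega / L_Omega)^(2/n).

    Because the likelihood ratio statistic is LR = -2 * ln(L_omega / L_Omega),
    the likelihood ratio raised to the power 2/n equals exp(-LR / n).
    """
    return 1.0 - math.exp(-lr_chi2 / n)

# Inputs taken from table 2: (likelihood ratio chi-square, number of observations).
print(round(maddala_r2(174.5, 1717), 3))  # non-utilities
print(round(maddala_r2(4.2, 163), 3))     # utilities (LR is reported to one decimal only)
```

Because the utilities likelihood ratio is reported to one decimal place, the recomputed value may differ from the tabulated one in the third decimal.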


H2: The probability of forecasting shortly before an offering is not equal to the probability
of forecasting during other times for firms that finance externally. 


Figure 1 provides descriptive evidence on hypotheses 1 and 2 by illustrating the event-time 
relation between forecasting and financing for the sample of firms that financed externally during 
the sample period. The sample is aligned in event-time, with the external financing transaction 
occurring in month 0. For comparison purposes, we also plot the base rate of forecasting for
non-financing firms.²⁵ The 48 sample months (January 1980-December 1983) for each financing firm
were labeled based on their proximity in time to the closest external financing transaction for that 
firm. For example, if a firm only issued financing in months 38 and 41, month 37 would be coded 
as occurring one month before a financing month. A forecast occurring in month 39 would be 


Footnote 24 (continued from page 141) 
concerns are more pronounced shortly before external financing events. Second, we assume litigation costs depend more
on signals from information systems than on the systems themselves and, in particular, on signals released close to events
that give rise to the litigation. Third, we assume firms time disclosures to minimize litigation exposure near external
financing events.

25 This rate is calculated as the total number of forecasts issued by non-financing firms divided by the number of
firm-months for this subsample.
coded as two months before a financing month and one month after a financing month.²⁶ To
maximize use of available data, each event month containing a forecast observation is used and 
each financing event is used. For each event month (e.g., two months before financing), the 
forecasting frequency is defined as the total number of forecasts in months coded this way (e.g., 
including month 39 in the example) divided by the total months coded this way. The ratio is used 
to control for the effects of the sample window and multiple financing firms. The event window 
includes the 76-month period from event-month -41 to +34; beyond these months the number of 
possible event-months is less than 100. 
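
The event-time coding and the forecasting-frequency ratio described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each sample month is coded relative to the closest financing month after it and the closest one before it (which reproduces the month 37 and month 39 examples), and it treats a month as either containing a forecast or not. The names `event_codes` and `forecast_frequency` are hypothetical.

```python
from collections import Counter

def event_codes(month, financing_months):
    """Event-time codes for one sample month of one firm.

    Returns the month's offset from the closest financing month after it and
    from the closest financing month before it (negative = before financing).
    """
    if month in financing_months:
        return [0]
    codes = []
    later = [f for f in financing_months if f > month]
    earlier = [f for f in financing_months if f < month]
    if later:
        codes.append(month - min(later))    # months before the next financing
    if earlier:
        codes.append(month - max(earlier))  # months after the previous financing
    return codes

def forecast_frequency(firms, n_months=48):
    """firms: iterable of (financing_months, forecast_months) pairs, one per firm.

    For each event month, returns the number of forecast months coded that way
    divided by the total number of months coded that way.
    """
    coded, forecasts = Counter(), Counter()
    for financing_months, forecast_months in firms:
        for month in range(1, n_months + 1):
            for code in event_codes(month, financing_months):
                coded[code] += 1
                if month in forecast_months:
                    forecasts[code] += 1
    return {code: forecasts[code] / coded[code] for code in coded}

# The example from the text: financing in months 38 and 41, a forecast in month 39.
print(event_codes(37, [38, 41]))
print(event_codes(39, [38, 41]))
```

Dividing by the number of months coded a given way is what keeps a forecast that maps to two event months from inflating the rate.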

Figure 1 suggests that the greater tendency of financing firms to forecast relative to
non-financing firms is not due to an increase in forecasting in the period immediately before or after
financing, and thus that H2 is not rejected. Rather, consistent with the regression results in the
previous section, the data suggest a greater tendency of financing firms to forecast throughout the
event period.²⁷

To formally test H2, we divide the 48-month sample period into firm-months that are “shortly
before” and “not shortly before” an offering. The length of the event windows shortly before
offerings is an arbitrary design choice. Ideally, the window would be close enough to the offering
so that disclosure in the event window could affect the information available at the offering, but
not so close as to overlap with the “quiet period” that follows the start of the registration period
and extends to 45 days after the offering.²⁸ For shelf registrations, however, there is no quiet
period, so the event window should end on the offering date. Unfortunately, we do not know which
of our offerings are shelf registrations and we do not have registration dates for about 20 percent
of our sample.²⁹ For this reason, we arbitrarily end our short windows on the last day of the
calendar month before the offering. We have also confirmed that our conclusions are robust to 
this choice by repeating our tests with event intervals ending one month earlier. Neither theory 
nor our understanding of the institutional setting suggests when managers might begin to disclose 
information to influence a specific offering. For the reported results, we define a firm-month as
shortly before an offering if it is nine months or less before the offering month.³⁰ We use equation
(2) to test H2:


$\Pr(\text{FM}) = \beta_0 + \beta_1\,\text{LASSETS} + \beta_2\,\text{SBXF} + \beta_3\,\text{MONTH1} + \cdots + \beta_{13}\,\text{MONTH11} + \beta_{14}\,\text{FY0} + \beta_{15}\,\text{FY1} + \beta_{16}\,\text{FY2} + \varepsilon$    (2)


26 Note that this forecast is not “double-counted” because the forecasting rate is calculated relative to possible
event-months.

27 Specifically, throughout the 76-month event period, the monthly forecast rate of financing firms is less than that of
nonfinancing firms only 11 times. Using a binomial test, the hypothesis that the proportion of months in which the
forecasting rate is greater for financing firms is 0.5 is rejected at less than 0.001 (z=6.08). This hypothesis is also rejected
for the period before financing (z=5.00) and the period after financing (z=3.26). This result is not driven by firms with
multiple financings. A plot of the data for firms with only one financing transaction, based on fewer observations,
exhibits the pattern in figure 1.
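
The binomial test in footnote 27 can be approximated with a normal z-statistic. The sketch below is an assumption about the computation, since the footnote does not state the formula: it uses the normal approximation with a 0.5 continuity correction and counts every month in which the financing-firm rate is not lower as a month in which it is greater. The function name `binomial_z` is hypothetical.

```python
import math

def binomial_z(successes: int, trials: int, p0: float = 0.5) -> float:
    """Normal-approximation z-statistic for a binomial proportion test,
    with a 0.5 continuity correction."""
    mean = trials * p0
    sd = math.sqrt(trials * p0 * (1.0 - p0))
    return (abs(successes - mean) - 0.5) / sd

# 76 event months; the financing-firm rate is lower in 11 of them.
z = binomial_z(76 - 11, 76)
print(round(z, 2))
```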

28 The registration period begins no later than the time the company reaches an agreement with an underwriter on
an offering, and concludes when the offering is complete and the aftermarket prospectus delivery period has ended. The
aftermarket prospectus delivery period for seasoned offerings ends 40 days after the later of the effective date of the
registration statement or the commencement of the public offering.

29 For the remaining observations, the median number of days between the registration and offering dates is 18 (15) days
for the equity (debt) offerings. The corresponding standard deviations are 23 (10) days, reflecting the shelf registrations.

30 Because this 9-month choice is ad hoc, we also repeated our tests for 3- and 6-month windows. None of our conclusions
differ for these alternatives.
where:
FM      = 1 if the firm issued a forecast that month,
          and 0 otherwise,
LASSETS = log of average year-end assets for 1980-1983,
SBXF    = 1 if the month is “shortly before” an external financing transaction,
          and 0 otherwise,
MONTHn  = 1 if the fiscal year month is n, and 0 otherwise,
FYn     = 1 if the fiscal year is 198n, and 0 otherwise.


The dependent variable equals one when a firm releases a management forecast in a given 
fiscal-year month, and equals zero otherwise. Thus, each firm provides 48 firm-month observa- 
tions, one per sample-period month. Two sets of controls are included to reduce the problem of 
omitted variables that are correlated with the dependent and independent variables, and to 
decrease the standard errors of the parameter estimates. The first adjusts for firms' tendencies to 
issue forecasts at various points in the fiscal year.³¹ Eleven dummy variables, MONTH1-MONTH11,
are included for fiscal months 1 through 11, respectively. The degree of information asymmetry 
between management and shareholders may vary during the fiscal year. For example, there may 
be less information asymmetry when financial statements are released. If so, and if firms tend to 
finance externally at this time, then they do not have to issue forecasts to reduce information 
asymmetries.³² Accordingly, these fiscal-month control variables may be correlated with both the
dependent and the experimental variables. The second set of variables, FY0-FY2, controls for
differences in forecast probability by sample year.³³

Only firms that financed externally are included in these estimations to prevent the observed 
long-run relationship between forecast occurrence and offering occurrence from driving the 
short-window results. In other words, we test whether firms that finance externally time their 
disclosures in relation to external financing transactions. 
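
To make the firm-month design concrete, the regressors of equation (2) for one firm might be built as in the sketch below. It is illustrative only: it assumes a calendar fiscal year (so the fiscal month equals the calendar month), treats fiscal month 12 and fiscal year 1983 as the omitted categories, and uses a made-up LASSETS value. The helper names `sbxf` and `firm_month_row` are hypothetical.

```python
def sbxf(month, offering_months, window=9):
    """1 if `month` falls 1..window months before some offering month, else 0."""
    return int(any(1 <= offering - month <= window for offering in offering_months))

def firm_month_row(month, offering_months, lassets):
    """Regressors for one firm-month; month runs 1..48 with 1 = January 1980.

    Assumes a calendar fiscal year, so the fiscal month equals the calendar month.
    """
    fiscal_month = (month - 1) % 12 + 1
    fiscal_year = 1980 + (month - 1) // 12
    row = {"LASSETS": lassets, "SBXF": sbxf(month, offering_months)}
    for n in range(1, 12):   # fiscal month 12 is the omitted category
        row[f"MONTH{n}"] = int(fiscal_month == n)
    for n in range(3):       # fiscal year 1983 is the omitted category
        row[f"FY{n}"] = int(fiscal_year == 1980 + n)
    return row

# One firm with offerings in sample months 38 and 41 (LASSETS value is made up).
rows = [firm_month_row(m, [38, 41], lassets=6.2) for m in range(1, 49)]
print(sum(r["SBXF"] for r in rows))  # number of "shortly before" firm-months
```

Each firm contributes 48 such rows, and the probit is then estimated on the stacked rows of all financing firms with FM as the dependent variable.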

The estimation results reported in table 3 indicate that non-utilities are not significantly more
likely to issue forecasts before offerings than at other times (t=0.481 for the $\beta_2$ estimate), and that
utilities are significantly less likely to issue forecasts before offerings (t=-1.77). The results also
indicate that forecasts are more likely for larger firms (t=13.17), and more likely to occur in certain
fiscal months than others, although interpretation of the fiscal month pattern is not obvious.
Overall, the explanatory power of the model is weak, reflecting the difficulty of explaining the
specific month in which a forecast will take place. While this may raise a question as to the power
of the test to detect an increase in forecasting shortly before financing, it should also be noted that
the results of this model are consistent with the evidence in figure 1, which suggests no increased
tendency to forecast shortly before an offering.

One interpretation of these results is that the benefit of a lower cost of capital derived from 
reducing information asymmetries before offerings is offset, on average, by the increased costs 
of legal liability in the non-utilities subsample. In the case of utilities, where the regulatory process 
acts to decrease management's private information, expected costs of legal liability exceed the 
benefits of forecasting. In particular, regulators are accustomed to adjusting the rate-base for 
utilities’ cost of capital, but they are less likely to include the legal costs associated with a 


31 McNichols (1989) reports a range for her sample from 25 forecasts in fiscal month 1 to 118 forecasts in fiscal month 
12, with considerable variation over months 2 through 11. 

32 Korajczyk et al. (1991) provide evidence that firms are more likely to finance after annual earnings are disclosed.

33 Choe et al. (1993) suggest that frictions created by private information may vary over time because the pool of available
investment opportunities changes with the business cycle. 
TABLE 3
The Timing of Forecasts Relative to External Financing Transactions

$\Pr(\text{FM}) = \beta_0 + \beta_1\,\text{LASSETS} + \beta_2\,\text{SBXF} + \beta_3\,\text{MONTH1} + \cdots + \beta_{13}\,\text{MONTH11} + \beta_{14}\,\text{FY0} + \beta_{15}\,\text{FY1} + \beta_{16}\,\text{FY2} + \varepsilon$    (2)

Dependent Variable: Forecast in 9 months before Offering

                               Non-Utilities                Utilities
                          Coefficient  t-statistic    Coefficient  t-statistic
Intercept                   -2.446      -37.588         -3.031
SBXF                         0.013        0.481         -0.157

Control Variables
LASSETS                      0.927       13.173          0.133
Fiscal-Month 1              -0.320       -4.701
Fiscal-Month 2               0.063        1.118
Fiscal-Month 3              -0.125       -2.051
Fiscal-Month 4               0.221        4.140
Fiscal-Month 5               0.230        4.281
Fiscal-Month 6              -0.147       -2.374
Fiscal-Month 7              -0.007       -0.113
Fiscal-Month 8              -0.349       -5.035
Fiscal-Month 9              -0.307       -4.541
Fiscal-Month 10              0.126        2.260
Fiscal-Month 11             -0.190       -2.988
Fiscal-Year 0                0.032        0.897
Fiscal-Year 1               -0.042       -1.192
Fiscal-Year 2               -0.006       -0.161

Likelihood Ratio $\chi^2$   424.28
Maddala $R^2$                0.010

Number of observations      42,192


Above are the parameter estimates of probit model (2), with forecast occurrence as the dependent variable and occurrence
of external financing transactions within 9 months (SBXF) and the log of average year-end total assets from 1980-1983
(LASSETS) as independent variables. The dependent variable for the model is 1 (0) if a forecast occurred (did not occur)
in that sample period month. The fiscal-month 1 to fiscal-month 11 variables are 1 when month t is that month of the fiscal
year, and 0 otherwise. Fiscal year 0 (1, 2) is a dummy variable coded 1 in 1980 (1981, 1982), and 0 in the other years.
The Maddala $R^2$ is defined in table 2. The Likelihood Ratio $\chi^2$ is significant at less than the 0.0001 level of significance
for the non-utilities sample, and at the 0.005 level for the utilities sample.


shareholder suit. The descriptive evidence in figure 1 and the regression results in tables 2 and 3 
suggest that Ruland et al. (1990) find that forecasting firms tend to finance more frequently during 
the three months following a forecast (than firms that do not forecast) because financing firms 
forecast more in general, not because they increase their level of forecasting at this time. 
V. THE CHARACTER OF FORECASTS SHORTLY BEFORE
EXTERNAL FINANCING EVENTS 


Our third research question examines whether the character of forecasts changes shortly 
before an offering. If litigation concerns are more important than cost of capital considerations 
shortly before an offering, bad-news (good-news) firms will be more (less) likely to release 
earnings forecasts, and conversely, if litigation concerns are less important, good-news firms will 
be more likely to release earnings forecasts. To assess the relative importance of these competing 
incentives, we test whether management’s earnings forecasts are systematically different from 
prior analyst forecasts in either direction. 


H3a: Forecast deviations in the period shortly before an offering are not equal to forecast
deviations in the period outside the offering period for firms that finance externally. 


We also test whether managers attempt to artificially boost share prices before an offering 
by examining whether forecasts are optimistic relative to the earnings per share realized after the 
offering. The greater potential for legal sanctions in the pre-offering period serves as an opposing 
force, and to the extent that this effect dominates the benefits of a higher offering price, 
management forecasts will not be optimistic relative to earnings realizations or to existing 
analysts’ forecasts. 


H3b: Forecasts in the period shortly before an offering are unbiased estimates of the earnings
per share realized after the offering. 


We test these hypotheses using measures of management forecast deviations and errors, 
respectively. The results are presented for the sample as a whole.³⁴

The forecast deviation is measured as management’s earnings per share forecast less the 
median of analysts’ earnings per share forecasts for the month ending before the release of the 
management forecast, all divided by the share price ten days prior to the release of the 
management forecast.³⁵ Earnings per share are measured before extraordinary items and discontinued
operations. The forecast deviation reflects whether management’s earnings forecast is
more or less favorable than existing expectations. 

We measure a management forecast error as realized earnings per share less management’s 
forecast, all divided by the preannouncement share price. If management forecasts prior to 
offerings are strategically optimistic relative to management’s information, the average forecast 
deviation will be positive and the average forecast error will be negative. In contrast, if firms with 
positive private information issue unbiased forecasts, the average forecast deviation will be 
positive but the average forecast error will be insignificantly different from zero.³⁶
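
The two measures defined above can be written out as a short sketch. The numbers are made up for illustration, and the sketch assumes the same scaling price (the share price ten days before the forecast release) for both measures; the function names are hypothetical.

```python
from statistics import median

def forecast_deviation(mgmt_forecast, analyst_forecasts, price):
    """Management EPS forecast less the median analyst forecast, scaled by price."""
    return (mgmt_forecast - median(analyst_forecasts)) / price

def forecast_error(realized_eps, mgmt_forecast, price):
    """Realized EPS less management's forecast, scaled by price."""
    return (realized_eps - mgmt_forecast) / price

# Hypothetical firm: management forecasts 2.10, analysts expect about 2.00,
# realized EPS turns out to be 1.90, and the pre-release share price is 40.
dev = forecast_deviation(2.10, [1.90, 2.00, 2.05], 40.0)
err = forecast_error(1.90, 2.10, 40.0)
print(dev, err)
```

In this made-up case the deviation is positive and the error is negative, which is the pattern the text associates with a strategically optimistic forecast.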

Descriptive statistics for the forecast deviations and forecast errors are reported in panel A
of table 4 for forecasts released outside the nine months before an offering. Neither the average 
forecast deviation, nor the average forecast error is significantly different from zero. Descriptive 
statistics for the forecast deviations and errors are reported in panel B for forecasts released in the 
nine months before an offering. The median forecast deviation before an offering is weakly 


34 Sensitivity analysis indicates that our conclusions are unaffected by a utilities/non-utilities grouping.
35 We have also conducted these tests on percent forecast deviations and errors, with similar results.
36 The average forecast error might also be zero if management can manipulate earnings to correspond to its forecast.
TABLE 4
Analysis of Forecast Deviations and Forecast Errors

Panel A. Forecasts outside the 9-month window
                                                                  Standard
                          N       Mean      Median    % positive  Deviation
Forecast deviations¹     439    -0.0009    -0.0012      44.87%     0.0775
Forecast errors²         494    -0.0064     0.0001      50.20%     0.1407

Panel B. Forecasts occurring less than 9 months before an offering
                                                                  Standard
                          N       Mean      Median    % positive  Deviation
Forecast deviations¹      69    -0.0166    -0.0012      44.93%     0.1164
Forecast errors²          73    -0.0110    -0.0018      41.10%     0.0504

Panel C. Wilcoxon tests of differences in the forecasts in the 9 months before offerings relative
to forecasts at other times.
                                       Two-tailed
                         Wilcoxon      Probability
                         z-statistic   Value
Forecast deviations¹       -0.53         0.60
Forecast errors²           -1.11         0.27

1 Forecast deviations are measured as management’s EPS forecast less the median IBES forecast most closely
preceding the forecast release date, scaled by the firm's share price 10 days prior to management's forecast release.
2 Forecast errors are measured as EPS before extraordinary items and discontinued operations less management’s forecast
of EPS, scaled by the firm's share price 10 days prior to management’s forecast release.
3 The total number of forecast errors in this table exceeds the number of numeric forecasts in table 1 because the
requirement that asset data be available was not imposed for table 4.


negative, but it is not significantly different from forecasts issued outside the pre-offering 
period.³⁷ We are therefore unable to reject H3a. This suggests that cost of capital considerations
and sampling bias effects do not dominate legal liability considerations. We are also unable to 


37 This finding is corroborated by the stock price reaction to forecasts issued within nine months before offerings, which
was insignificantly different from zero for firms issuing equity and weakly negative for firms issuing debt.
reject the null hypothesis that the mean ex post forecast error is zero.³⁸ These data
are consistent with the notion that existing anti-fraud rules and reputational concerns are
sufficient to deter biased forecasts in the pre-offering period.


VI. SUMMARY AND CONCLUSIONS


This paper examines the relation between external financing transactions and management’s
tendency to issue qualitative or quantitative forecasts of annual earnings. Our central finding is 
that firms are significantly more likely to forecast if they access capital markets over the sample 
period, but that these financing firms do not change their forecasting behavior in the period 
surrounding an offering. This evidence suggests that management issues forecasts to communi- 
cate with investors, and that management views disclosure as valuation-relevant. However, this 
tendency toward greater disclosure by financing firms is not directed exclusively toward 
influencing investors’ short-term earnings expectations at the time an offering is made. The 
evidence is consistent with potentially greater benefits of disclosure in the pre-offering period 
being offset by greater disclosure costs, such as increased exposure to legal liability under the
1933 Securities Act. 

We also compare the level of management forecasts made shortly before an offering to prior 
analyst expectations and subsequently realized earnings. The average forecast deviations and 
errors in the offering period are insignificantly different from zero, and insignificantly different
from forecast deviations and errors at other times in the sample period. These results suggest that 
management disclosure is not motivated solely by the desire to reveal favorable information. 
Other factors, such as legal liability or reputation considerations, motivate firms to disclose
information that is unfavorable. 

Taken as a whole, our results suggest that managers release earnings forecasts over the 
sample period to influence capital market participants. The aim of these forecasts may be to attract 
attention to the firm, as in Merton’s (1987) awareness model, to develop a reputation for reliable 
disclosure, in the spirit of arguments made by Diamond (1985), King et al. (1990), and Healy and 
Palepu (1993), to avoid legal liability, or, for some firms, to disclose favorable news. The fact that 
financing firms have greater incentives to voluntarily disclose information than non-financing 
firms suggests that market forces provide some incentives for more disclosure. We consider a 
better understanding of these forces an important issue for future research. 


38 Similar results were reported for the full sample of numeric forecasts in the 1979-83 time period in McNichols (1989). 
I. INTRODUCTION


The Financial Accounting Standards Board (FASB) is currently deliberating issues related 
to hybrid security reporting. The current reporting model under the FASB's conceptual frame- 
work classifies sources of capital as either debt or equity. The usefulness of this reporting 
approach depends both on differences between classifications and on the similarity of the 
securities summarized within classifications. To the extent that securities reported within 
classifications have similar economic substance, such classifications will exhibit representational 
faithfulness with respect to the phenomena they purport to describe (FASB 1980, SFAC 2, para. 
63). 

The existing dichotomous classification approach may not adequately report the economic 
substance of hybrid securities - those with both debt and equity characteristics. Concern over this 
potential limitation in the reporting model has increased in light of the increased issuance of 
hybrid securities in recent years (Woods and Bullen 1989). In addition, difficulties in distinguishing
the debt and equity features of hybrid securities have important implications for implementation
of the FASB's fundamental financial instruments approach for the recognition and measurement
of financial instruments (FASB 1991, Barth et al. 1993). In response to these concerns, the
FASB has issued a discussion memorandum that considers alternative classification approaches 
for hybrid securities (FASB 1990). Retaining the current model is one of the alternatives, as is the
addition of a “quasi-equity” financial statement element for securities with both debt and equity 
characteristics. 

In this paper we examine a particular hybrid security, redeemable preferred stock, to provide 
evidence that is potentially useful in assessing classification as a means of conveying information 
about hybrid securities. Redeemable preferred stock (RPFD hereafter) is representative of hybrid 
securities; RPFD have a mandatory redemption provision (a debt characteristic) while at the same 
time they retain basic characteristics of equity (such as an inability to force a firm into bankruptcy 
for delinquency of dividend or redemption payments). We focus on RPFD for a number of 
reasons. First, it is a commonly issued hybrid security.¹ Second, resolution of RPFD reporting
was specifically cited in the discussion memorandum on hybrid securities. This emphasis is due
in part to current inconsistent authoritative guidance between generally accepted accounting
principles and SEC regulations.² Finally, while there is theoretical debate as to the appropriate
reporting approach for RPFD, there is little empirical evidence on how the various debt and equity
characteristics of RPFD affect its economic substance.³

One indicator of the economic substance of a security is its impact on the issuing firm's 
systematic risk. This study provides empirical evidence concerning the stock market perception 
of the economic substance of RPFD based on the relation between firm leverage and systematic 
risk. The tests are conducted on a sample of 239 firms with RPFD outstanding between 1979 and 
1989 and examine variation in this relation conditioned on the magnitude of RPFD as an element 


1 For example, the sample in this paper is drawn from over 400 RPFD issues from the 1979-1989 time period.

2 Securities and Exchange Commission (SEC) registrants are precluded from classifying RPFD as equity in their balance
sheets (SEC 1979). The SEC does not, however, require that RPFD be classified as debt. As a consequence, most firms 
report RPFD in a “mezzanine” section on the balance sheet between debt and equity (Nair et al. 1990). This treatment 
has no support within the existing conceptual framework. 

3 While Nair et al. (1990) apply the FASB conceptual framework definitions of debt and equity to conclude that RPFD
should be classified as debt, the results in Kimmel and Warfield (1993) suggest that RPFD exhibit considerable
heterogeneity in features, which makes it difficult to apply these definitions. Sellers et al. (1992) examine the market
perception of RPFD for a small sample of 24 RPFD issues; they do not examine the implications of RPFD attributes 
for the market perception of these hybrid securities. 
of firms’ capital structures. If investors believe that the redemption requirements of RPFD make
it similar to debt, then the observed relation between RPFD and systematic risk should be similar
to that between debt and systematic risk. Our evidence suggests that, despite mandatory
redemption payments, RPFD does not have a debt-like impact on systematic risk, a finding we
attribute to legal features that allow RPFD issuers to defer dividend and redemption payments in 
some contexts (Manning and Hanks 1990). We also provide evidence that the market perception 
of a hybrid security is conditioned on security attributes. The findings indicate that convertible 
RPFD or RPFD that have voting rights equivalent to those of common stock have an effect on the 
market assessment of systematic risk similar to equity. However, RPFD without these equity 
characteristics does not exhibit a relation with systematic risk that is similar to that of debt or
equity.

To the extent that a security’s impact on systematic risk is an important criterion for evaluating
the economic substance of a security, these results have implications for assessing the usefulness 
of classification as a means of reporting hybrid securities. Specifically, the findings in this study 
indicate that RPFD does not consistently exhibit an impact on the market's assessment of 
systematic risk similar to either debt or equity. Since the usefulness of classifications is reduced 
if the items within a category are not similar in their economic substance, dichotomous 
classification of hybrid securities may lack representational faithfulness to the economic 
substance of these securities, as measured by their effects on systematic risk. More generally, the 
evidence questions the merit of classification within financial statements as a means of conveying 
information about hybrid securities. Since variation in RPFD attributes often results in securities 
with diverse economic substance, it may be difficult for the FASB to develop a comprehensive 
classification rule for hybrid securities, including the use of quasi-equity, without also developing
provisions for adequate disclosure of important security attributes. 


II. BACKGROUND AND MOTIVATION


The usefulness of financial reporting depends in part on the representational faithfulness of
the reporting treatment to the economic substance of the underlying phenomenon (FASB 1980,
SFAC 2, para. 63). Classifications of securities within an issuer's financial statements are useful 
only if the securities summarized within the classification have similar economic substance. 
Innovations in financial instruments have created challenges to classification as a means to report 
useful information. Related, but distinct, problems arise from two types of securities: compound
securities and hybrid securities. Compound securities, such as convertible bonds, represent a 
combination of two separably identifiable types of security. Hybrid securities, such as redeemable 
preferred stock (RPFD), are nondivisible securities, having both debt and equity characteristics.⁴
A recent FASB discussion memorandum [FASB 1991] addresses the problems created by
compound securities by proposing the fundamental financial instruments approach. This ap- 
proach decomposes compound securities into fundamental financial instruments, (e.g., a convert- 
ible bond into a straight bond and an equity option), that are then classified within the current 
dichotomous reporting model as either debt or equity. While this approach may eventually prove 
useful for compound securities, the discussion memorandum notes that general implementation 
of the fundamental financial instruments approach will not be possible until reporting problems 
associated with hybrid securities are resolved. That is, if decomposition of a compound security 


4 As documented in table 2, compound hybrids, such as convertible RPFD, also are relatively common.
results in a hybrid security as a basic element, then that hybrid must be classified as either debt
or equity. This may result in securities with differing economic substance summarized within the
same classification, thus limiting the informativeness of the reports.⁵

A related FASB discussion memorandum [FASB 1990] considers possible solutions to 
hybrid security classification. One approach would continue dichotomous classification after re- 
evaluation (and possibly redefinition) of the FASB definitions of debt and equity. The discussion 
memorandum also considers alternatives to dichotomous classification, such as creation of a third 
element in financial statements (quasi-equity) for securities with both debt and equity features.⁶

Some prior research has used the current conceptual framework definitions, and focused on 
mandatory redemption features to argue that RPFD be classified as debt within the current 
dichotomous model (Nair et al. 1990). However, the economic substance of these and other hybrid
securities is arguably a function of other features that are not easily accommodated in the current
model. For example, legal features of RPFD (indeed, all preferred stock) preclude payments to
holders of such securities when the payments threaten the solvency of the firm.⁷ Thus, the
conflicting implications of these features suggest that a simple characterization of RPFD as either 
debt or equity based on the conceptual framework definitions is unlikely to result in useful 
classifications. If conflicting features are relevant to useful classification of hybrid securities, then 
the quasi-equity approach may be more useful than dichotomous classification for reporting 
hybrid securities. 

The tests described below provide evidence on the economic substance of these securities 
based on an analysis of the association between the market’s assessment of systematic risk and 
operating risk conditional on the inclusion of RPFD in the capital structure. Additional analyses 
test for variation in the economic substance of these securities conditional on attributes such as 
voting rights and convertibility to common stock. Observing variation in the market perception 
of RPFD conditional on attributes such as voting rights or conversion privileges indicates the 
importance of security attributes for the market’s assessment of the economic substance of these 
securities.⁸

While these tests may not be relied upon to definitively determine the appropriate classifi- 
cation model for hybrid securities like RPFD, they can provide important information regarding 
one indicator of RPFD’s economic substance - its relation with systematic risk. To the extent that 
RPFD’s relation with systematic risk differs from that of securities classified as either debt or 
equity, classification within a dichotomous framework can obscure important attributes of RPFD. 
Furthermore, evidence on security attributes and the consequent effect on the market perception 
of RPFD is important because these features increase the difficulty of deriving a comprehensive 
rule for RPFD classification that is consistent with each issue's economic substance, including


5 Barth et al. (1993) evaluate the ability of the fundamental financial instrument approach to value compound securities.
While recognizing the importance of resolving the debt versus equity distinction for implementation of the fundamental 
financial instruments approach, they do not address it in their study. 

6 A similar approach was recently adopted by bank regulators to define bank capital for the calculation of the capital ratio
(Board of Governors 1988). 

7 While RPFD payments are an obligation of the firm under normal operating conditions, RPFD investors cannot force
a delinquent firm into bankruptcy (Manning and Hanks 1990; FASB 1990). This is important because financial theory 
suggests that a primary characteristic of debt is that creditors have the option of forcing a delinquent debtor into 
bankruptcy (Myers 1977; Warner 1977). Recent research investigates the implications of bankruptcy costs for financial 
instrument innovations such as hybrid and compound securities (John 1993). 

8 Conversion privileges imply some positive probability that the security will become equity. Voting rights are a
fundamental feature of equity since they convey to holders control over firm investment decisions; such decisions
impact cash flows and risk (Hart 1989).

use of the quasi-equity approach.⁹ The next section develops the theoretical framework for the
empirical tests of the market perception of the economic substance of RPFD. 


III. RELATION BETWEEN MARKET RISK, DEBT AND RPFD


To provide evidence on the economic substance of hybrid securities, this study examines the 
relation between RPFD (as an element of the capital structure) and systematic risk. Previous 
research demonstrates that alternative securities in firms' capital structures affect systematic risk. 
For example, Hamada (1972) and Bowman (1979) show that the use of debt financing increases 
a firm's systematic risk. Within the framework of the capital asset pricing model, firm i's 
systematic sensitivity to changes in the market rate of return is captured by the common stock beta 
($\beta_e$), which is equal to:


$$\beta_e = \mathrm{Cov}(R_{it}, RM_t) / \mathrm{Var}(RM_t) \tag{1}$$


where $R_{it}$ and $RM_t$ are the returns in period t for firm i and the market portfolio, respectively.
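
The definition of beta as the covariance of firm and market returns divided by the variance of market returns can be illustrated with a minimal sketch. The function name and the return series below are hypothetical and chosen only for illustration; they are not data from this study.

```python
import numpy as np


def market_model_beta(firm_returns, market_returns):
    """Return Cov(firm, market) / Var(market) using sample moments."""
    r = np.asarray(firm_returns, dtype=float)
    m = np.asarray(market_returns, dtype=float)
    if r.shape != m.shape or r.size < 2:
        raise ValueError("need two return series of equal length (>= 2)")
    cov = np.cov(r, m, ddof=1)  # 2x2 sample covariance matrix
    return cov[0, 1] / cov[1, 1]


# Made-up returns: the firm moves 1.5 times as much as the market.
market = np.array([0.01, -0.02, 0.015, 0.03, -0.01])
firm = 0.002 + 1.5 * market
beta = market_model_beta(firm, market)
```

Because the illustrative firm series is an exact linear function of the market series, the computed beta equals the slope used to construct it.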

Since the value of the firm equals the sum of the value of common stock and other claims on 
firm assets (e.g., debt), the overall risk of the firm can be expressed as a weighted average of the 
betas of these claims: 


$$\beta_o = (D/V)\,\beta_d + (E/V)\,\beta_e \tag{2}$$


where $\beta_o$ is the operating risk of a firm and D and E are the market values of the firm's debt and
equity, respectively. $\beta_d$ is the beta for debt, and $\beta_e$ the beta for common stock. Note that D + E =
V, where V equals the total value of the firm.

Rearranging equation 2, the systematic risk of common stock can be expressed as a function
of the degree of operating risk and financial risk (leverage) incurred by the firm where the degree
of leverage can be measured by the ratio of debt to equity:¹¹


$$\beta_e = \beta_o + (D/E)(\beta_o - \beta_d) \tag{3}$$


Since $\beta_o > \beta_d$, equation (3) suggests that $\beta_e$ in equation (1) is positively related to the ratio
of debt to equity. Equation 4 generalizes the relation in equation 3 to consider the effects of RPFD:


$$\beta_e = \beta_o + (D/E)(\beta_o - \beta_d) + (R/E)(\beta_o - \beta_r) \tag{4}$$


where R = RPFD and $\beta_r$ is the beta for RPFD. Thus, to the extent that RPFD affects the financial
risk of the firm similar to debt, then $\beta_r = \beta_d$ and $(\beta_o - \beta_r) > 0$. Alternatively, if RPFD's impact on
risk is similar to common stock, then $(\beta_o - \beta_r)$ will be less than zero.
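
The sign prediction for the RPFD term can be checked numerically. The sketch below assumes the generalized relation in which common stock beta equals operating risk plus (D/E) times (operating risk minus debt beta) plus (R/E) times (operating risk minus RPFD beta). All function names and input values are hypothetical illustrations, not estimates from this study.

```python
def equity_beta(beta_o, beta_d, beta_r, d_over_e, r_over_e):
    """Common stock beta implied by operating risk, debt and RPFD."""
    return (beta_o
            + d_over_e * (beta_o - beta_d)
            + r_over_e * (beta_o - beta_r))


def equity_beta_if_rpfd_is_equity_like(beta_o, beta_d, d_over_e, r_over_e):
    """Solve the same relation for the case where RPFD beta equals equity beta."""
    return beta_o + d_over_e * (beta_o - beta_d) / (1.0 + r_over_e)


# Hypothetical inputs: operating beta 0.8, debt beta 0.2, D/E 0.5, R/E 0.25.
beta_o, beta_d, d_e, r_e = 0.8, 0.2, 0.5, 0.25

# Case 1: RPFD behaves like debt, so the RPFD term (beta_o - beta_r) is positive.
debt_like = equity_beta(beta_o, beta_d, beta_d, d_e, r_e)

# Case 2: RPFD behaves like common stock, so (beta_o - beta_r) is negative.
equity_like = equity_beta_if_rpfd_is_equity_like(beta_o, beta_d, d_e, r_e)
rpfd_term_sign = beta_o - equity_like
```

With these inputs the debt-like case gives a higher equity beta than the equity-like case, which is the contrast the empirical tests rely on.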


9 The quasi-equity approach has similar limitations to those of the dichotomous model, only to a lesser degree; use of the
quasi-equity approach may still summarize securities with quite different economic substance within the same
classification.

10 This approach takes the perspective of investors, which is justified given the pre-eminence of this set of users within
the FASB's conceptual framework. The methodology used here parallels that of Dhaliwal (1986). Dhaliwal tests
whether unrecorded unfunded vested pension obligations are treated like debt by the capital markets.

11 See Brealey and Myers (1984), chapter 17, for further discussion of this model. This formulation is similar to those in
Hamada (1972) and Bowman (1979). The approach used here allows incorporation of alternative claims O debt) 
with potentially different risks.

Restating equation 4 to express $\beta_e$ as a function of empirical measures of leverage and
operating risk results in the primary empirical model for tests of the market perception of RPFD:


$$\mathrm{Beta}_{it} = a_0 + a_1\,OP_{it} + a_2\,(OP_{it} \times (Debt_{it}/MVE_{it})) + a_3\,(OP_{it} \times (RPFD_{it}/MVE_{it})) \tag{5}$$


where: Beta = an empirical estimate of the systematic risk of the firm's common stock, $\beta_e$,
OP = measure of operating risk, $\beta_o$,
Debt/MVE = total debt divided by the market value of common stock plus the book value of
perpetual preferred stock,
RPFD/MVE = total redeemable preferred stock divided by the market value of common stock
plus the book value of perpetual preferred stock.¹²


Consistent with the theoretical relationship between leverage and systematic risk, positive 
and significant coefficients are expected for securities that increase financial risk. Therefore, if 
RPFD affects the market assessment of systematic risk in a manner similar to debt, we expect the 
estimated coefficients on Debt and RPFD to be equivalent ($a_2 = a_3 > 0$). Alternatively, an
insignificant or non-positive coefficient estimate for RPFD ($a_3$) is predicted if, conditional on the
level of debt, RPFD is perceived by the market to have both debt and equity features. That is, while 
any debt attribute of RPFD is expected to increase the perceived financial risk, the presence of 
equity features may mitigate the risk-increasing effects of these securities. Additional tests 
examine variation in the relation between systematic risk and RPFD conditional on equity 
attributes such as voting rights, and whether the security is convertible. These models will be 
introduced as those tests are presented. 
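
As a rough illustration of how the primary empirical model could be estimated, the sketch below fits an ordinary least squares regression of Beta on OP and on OP interacted with the two capital structure ratios. The data are synthetic and noise-free, the coefficient values are arbitrary, and the function name is hypothetical; the log transformations and controls used in the reported tests are omitted.

```python
import numpy as np


def fit_beta_model(beta, op, debt_ratio, rpfd_ratio):
    """OLS of Beta on [1, OP, OP*Debt/MVE, OP*RPFD/MVE]; returns a0..a3."""
    op = np.asarray(op, dtype=float)
    X = np.column_stack([
        np.ones_like(op),                    # intercept
        op,                                  # operating risk proxy
        op * np.asarray(debt_ratio, float),  # OP x Debt/MVE
        op * np.asarray(rpfd_ratio, float),  # OP x RPFD/MVE
    ])
    coefs, *_ = np.linalg.lstsq(X, np.asarray(beta, float), rcond=None)
    return coefs


# Synthetic firm/years with arbitrary "true" coefficients (not from the paper).
rng = np.random.default_rng(0)
n = 500
op = rng.uniform(1.0, 3.0, n)
debt = rng.uniform(0.1, 5.0, n)
rpfd = rng.uniform(0.0, 2.0, n)
true_coefs = np.array([0.1, 0.3, 0.05, -0.03])
beta = (true_coefs[0] + true_coefs[1] * op
        + true_coefs[2] * op * debt + true_coefs[3] * op * rpfd)

estimated = fit_beta_model(beta, op, debt, rpfd)
```

Because no noise is added, the estimated coefficients reproduce the values used to generate the data; with real data the sign and significance of the RPFD coefficient would be the object of interest.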


IV. SAMPLE AND DATA 


The sample is drawn from firms with RPFD outstanding sometime during the 1979 - 1989 
time period on the 1989 Compustat Industrial, Full Coverage, and Tertiary file. Data on various 
debt and equity attributes of the RPFD issues were gathered from financial statements or Moody’s 
manuals. Table 1 describes the various data availability screens applied to the maximum set of 
479 firms. Missing financial statement data as well as inability to find details on RPFD attributes 
such as voting rights or convertibility eliminated 76 firms. Constraining tests to firms with 
December fiscal year-ends but without a change in fiscal year-end eliminated 79 and 48 firms 
respectively from the sample.¹³ Lack of return data or other data for estimating market or
operating risk measures reduced the sample by another 37 firms. These screens resulted in a final 
sample of 239 firms for which cross-section and time-series data are pooled for tests in the 1979 
to 1989 time period. 
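
The sample reconciliation can be verified with simple arithmetic. The sketch below uses the counts as listed in table 1; the variable names are illustrative only.

```python
# Screens and the number of firms each one removes, as listed in table 1.
initial_firms = 479
screens = [
    ("missing RPFD details or financial statement data", 76),
    ("change in fiscal year-end during period", 79),
    ("non-December fiscal year-end", 48),
    ("insufficient return or other data", 37),
]

remaining = initial_firms
for label, dropped in screens:
    remaining -= dropped  # apply each screen in turn
```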


Measurement of Variables 


Beta 


Systematic risk measures for the sample firms were estimated based on market model 
regressions of weekly firm returns on weekly value-weighted market portfolio returns for each 


12 In this study we used accounting measures of debt and preferred stock. While there is evidence in some contexts that
use of book values reduces the power of tests such as ours (Mulford 1985), book values are used due to unavailability 
of market value data. 

13 Constraining to December year-end firms was done to facilitate estimation of accounting betas which requires
accounting rates of return measured over the same period.

TABLE 1
Firms used in tests of relation between systematic risk and RPFD


Firms on 1989 Compustat with RPFD outstanding in 1979-1989.              479
Unable to find details on RPFD terms in financial statements
  or Moody's manuals or other missing financial statement data.         (76)
Change in fiscal year-end during period.                                 (79)
Non-December fiscal year-ends.                                           (48)
Insufficient return data for estimating systematic risk,
  accounting betas, or other missing data.                               (37)
Firms used in tests.                                                      239


fiscal year. Firm/years with fewer than 30 weekly returns for estimation were dropped from the
tests.¹⁴
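
The estimation screen described above can be sketched as follows: a market model slope is estimated separately for each fiscal year, and years with fewer than 30 usable weekly returns are skipped. The function name, the dictionary layout, and the simulated returns are assumptions made for illustration.

```python
import numpy as np

MIN_WEEKLY_RETURNS = 30  # firm/years below this count are dropped


def yearly_betas(firm_by_year, market_by_year, min_obs=MIN_WEEKLY_RETURNS):
    """Map fiscal year -> market model slope, skipping thin years."""
    betas = {}
    for year, firm in firm_by_year.items():
        r = np.asarray(firm, dtype=float)
        m = np.asarray(market_by_year[year], dtype=float)
        usable = ~np.isnan(r) & ~np.isnan(m)
        if usable.sum() < min_obs:
            continue  # too few weekly returns to estimate beta
        slope, _intercept = np.polyfit(m[usable], r[usable], 1)
        betas[year] = slope
    return betas


# Simulated weekly returns: a full year (52 weeks) and a thin year (20 weeks).
rng = np.random.default_rng(42)
market = {1985: rng.normal(0.002, 0.02, 52), 1986: rng.normal(0.002, 0.02, 20)}
firm = {year: 0.001 + 1.2 * m for year, m in market.items()}

betas = yearly_betas(firm, market)
```

Only the year with a full set of weekly returns yields an estimate; the thin year is excluded by the screen.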


Operating Risk (OP) 


Prior research has estimated operating risk in a number of ways. Litzenberger and Rao (1972)
used a measure based on the variability of earnings while Dhaliwal (1986) used an "accounting 
beta" estimated from a regression of firm earnings yields (income before interest and taxes 
divided by assets) on market earnings yields. Bowman (1979) suggests that these alternative 
measures can serve equally well as proxies for operating risk. Our primary results are based on 
the accounting beta proxy. 
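
A minimal sketch of the accounting beta proxy follows: firm earnings yields (income before interest and taxes divided by assets) are regressed on market earnings yields, and the slope is the accounting beta. The log transformation after adding a constant is also shown; the size of that constant is not stated in the text, so the shift used here is a hypothetical value, as are the function names and input data.

```python
import numpy as np


def accounting_beta(ebit, assets, market_yield):
    """Slope from regressing firm earnings yield on market earnings yield."""
    firm_yield = np.asarray(ebit, float) / np.asarray(assets, float)
    slope, _intercept = np.polyfit(np.asarray(market_yield, float), firm_yield, 1)
    return slope


def log_after_shift(values, shift):
    """Natural log after adding a constant that removes non-positive values."""
    shifted = np.asarray(values, float) + shift
    if np.any(shifted <= 0):
        raise ValueError("shift is too small to make all values positive")
    return np.log(shifted)


# Ten hypothetical years of data for one firm.
market_yield = np.array([0.10, 0.12, 0.08, 0.11, 0.09,
                         0.13, 0.07, 0.10, 0.12, 0.09])
assets = np.full(10, 100.0)
ebit = assets * (0.02 + 0.8 * market_yield)

op_raw = accounting_beta(ebit, assets, market_yield)
# The shift of 1.5 is an arbitrary illustration, not the paper's constant.
op_logged = log_after_shift([-0.5, op_raw], shift=1.5)
```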


Capital Structure Variables 


Debt and RPFD are measured based on data from the Compustat file as of the end of the fiscal
year. Debt is measured as total liabilities (without RPFD) at fiscal year-end divided by the market
value of common equity plus perpetual preferred stock. Since firms receive a tax shield from use
of debt in their capital structures, the debt variable is multiplied by (1-Tax Rate) where Tax Rate
is estimated for each firm/year based on the ratio of income tax expense divided by pretax
income.¹⁶ RPFD is total redeemable preferred stock divided by the market value of common
equity plus perpetual preferred stock at fiscal year-end.¹⁷
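
The capital structure measures described above can be computed as in the following sketch. The function name and the input amounts are hypothetical; how firm/years with non-positive pretax income were handled is not stated in the text, so rejecting them here is an assumption made only to keep the example well defined.

```python
def capital_structure_ratios(liabilities_ex_rpfd, rpfd, mv_common,
                             perpetual_preferred, tax_expense, pretax_income):
    """Return (tax-adjusted Debt ratio, RPFD ratio) for one firm/year."""
    if pretax_income <= 0:
        # Assumption: the tax rate estimate is undefined in this case.
        raise ValueError("pretax income must be positive to estimate a tax rate")
    deflator = mv_common + perpetual_preferred
    tax_rate = tax_expense / pretax_income
    debt_ratio = (liabilities_ex_rpfd / deflator) * (1.0 - tax_rate)
    rpfd_ratio = rpfd / deflator
    return debt_ratio, rpfd_ratio


# Hypothetical firm/year (all amounts in the same currency units).
debt_ratio, rpfd_ratio = capital_structure_ratios(
    liabilities_ex_rpfd=600.0, rpfd=50.0, mv_common=400.0,
    perpetual_preferred=100.0, tax_expense=34.0, pretax_income=100.0)
```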


14 Use of weekly returns to estimate systematic risk allows sufficient degrees of freedom in the regressions (up to 52
observations) while avoiding potential bias in estimated betas for small firms due to non-synchronous trading for daily
returns, and potential non-stationarity in beta when using monthly returns over a number of years.

15 At least ten years, and up to 20 years of data were used to estimate the accounting betas. To mitigate the impact of the 
skewed nature of the distribution, logs were taken (after a constant was added to the estimates to eliminate negative 
values). We repeated the analyses 1) without this transformation, and 2) using earnings variability as a proxy for 
operating risk. The inferences in this study were unaffected by use of these alternative operating risk measures.

16 Like Dhaliwal (1986), the analyses were repeated without a tax rate adjustment; no change in the inferences was
observed. 


17 Including perpetual preferred stock in the deflator implies the assumption that these securities have the same risk as
common stock.

TABLE 2
Descriptive statistics for the full sample and industry sub-samples *


SIC          NUM    BETA     OP     DEBT    RPFD    VOTE    CONV
1              9    1.16    2.61    5.30    1.39    33.3    44.4
2             22    1.24    2.21    3.02    1.19    27.3    27.2
3             32    1.11    2.16    4.42    1.28    21.9    28.1
4             24    1.03    2.14    4.90    1.31    33.3    25.0
5             10    1.25    2.15   14.20    1.44    40.0    30.0
6             23    1.05    2.10   21.00    1.23    21.7    39.1
7              1    1.62    2.34    5.90    1.25     0.0     0.0
1-7          121    1.13    2.19    8.29    1.28    27.2    30.6
4.9          118    0.50    2.04    3.02    1.11    21.2     1.7
All Firms    239    0.82    2.12    5.69    1.20    24.3    16.3


* Statistics based on by-firm means across all years of data for 239 firms with RPFD outstanding during 1979-1989. SIC
is the one-digit SIC code except for utilities - due to the size of the utility group (4.9), they were grouped separately.
SIC Codes:
1 Agriculture, Extractive, Construction
2 Food, Paper, Printing, Chemicals
3 Manufacturing
4 (Except for 4900) Transportation, Freight, Communication
49 Electric, Gas Utilities
5 Wholesale, Retail
6 Financial Services
7 Other Services
NUM is the number of firm means represented in each row. Beta (dependent variable) is estimated from a one factor
market model based on weekly returns over the fiscal year. OP (proxy for operating risk) is estimated from a regression
of firm accounting returns (income before interest and taxes scaled by assets) on a market index of accounting returns
(the natural log of OP is used to mitigate the potential impact of the skewed nature of the distribution for this variable).
Debt equals total debt divided by the market value of equity plus perpetual preferred stock at fiscal year end. RPFD
is redeemable preferred stock divided by the market value of equity plus perpetual preferred stock at fiscal year end.
VOTE is the percentage of RPFD with voting rights similar to common stock. CONV is the percentage of RPFD with
conversion to common stock privileges.


Descriptive Statistics 


Table 2 contains descriptive statistics for the sample firms. The average Beta for the sample 
(.82) reflects the large proportion of utilities represented in the sample (SIC=4.9). Due to 
regulation, utilities have lower levels of systematic risk. While utility firms have been long-time
issuers of RPFD, in recent years non-utility firms have also issued RPFD.¹⁸ On average, RPFD
represents 120 percent of issuing firms’ market values of equity, although RPFD is on average
a smaller element in utilities' capital structures compared to the non-utilities (1.11 < 1.28; p<.01).

In addition, many RPFD have attributes reflective of equity securities. For our sample firms,
24.3% have voting rights similar to common stock, and 16.3% are convertible to common stock.
There are also differences between utilities and non-utilities in the prevalence of these attributes.
In both the case of voting rights and conversion privileges, the non-utility firms are more likely
to have these features as compared to the utility sub-sample. Specifically, while 27.2% of the
non-utility firms' RPFD have voting rights similar to common, 21.2% of the utilities do. In addition,
virtually none of the utility issues (1.7%) have conversion privileges, while 30.6% of the
non-utility firms’ RPFD are convertible.¹⁹

Table 3 contains descriptive statistics based on pooling all firm/year observations that are 
used in the regression tests.²⁰ These statistics are quite consistent with the data reported in table
2. The correlations reported in the footnote to table 3 provide preliminary evidence of the
predicted positive association between systematic risk (Beta) and operating risk (OP) - (.361,
p<.01) and Beta and leverage (OP*Debt) - (.232, p<.01).²¹


V. RESULTS 


Table 4 contains the results of tests of the market perception of RPFD. All models reported 
in the table regress the systematic risk measure (Beta) on an operating risk proxy (OP). Tests of 
the market perception of RPFD are based on models that include debt and RPFD as additional 
explanatory variables (models 3 and 4). The coefficient on the operating risk measure ($a_1$) is
positive and significant in each of the models as is the estimate on the effects of leverage for
explaining systematic risk ($a_2$). These results are consistent with the prior empirical literature
concerning the determinants of systematic risk (e.g., Dhaliwal 1986) and support the theoretical 
predictions developed in section II. 

In table 4, the results from estimating equations 3 and 4 provide preliminary evidence
concerning the market perception of RPFD. In model 3, RPFD is included as a separate element
of the capital structure. Inconsistent with the contention that RPFD is similar in economic
substance to debt, the coefficient estimate for RPFD ($a_3$) is negative and significant (p<.042). In
model 4 of table 4, industry group dummies (one for each one-digit SIC code - see table 2) and 
year dummy variables were included in the model. The industry dummies are included to control 
for potential variation in systematic risk related to operating characteristics of the sample firms. 


18 See Kimmel and Warfield (1993) and Houston and Houston (1990) for a discussion of factors contributing to incidence 
of RPFD issuance over the sample period. Specifically, while utilities had traditionally been the most frequent issuers 
of RPFD prior to 1980, beginning in 1980, capital expenditures by utilities declined dramatically, decreasing the need
for RPFD financing. Second, in 1981 bank regulators adopted a definition of regulatory capital favorable to RPFD 
issuance. Third, merger activity increased significantly in the 1980s. Preferred stock (including RPFD) is often used 
both in mergers and as part of anti-takeover defenses. 

19 The difference in proportions for conversion privileges between the two groups is statistically significant at p<.01.
While the difference in voting rights proportions is not statistically significant, note that the utility group percentage is
less than all but one of the other industry groups.

20 Logged values of the capital structure variables were used in the regressions to mitigate the effects of extreme values
on the estimates. While generally weaker in statistical significance, results for the non-transformed data are qualitatively
similar. Since scaling the capital structure variables by assets also reduces the effects of extremely low market values
of equity in the denominator, we also repeated our analysis using asset-scaled variables (without logs; see also Karels
et al. (1989)). The results for these models are essentially the same as those reported in the tables.

21 In addition, the magnitude of the correlation coefficients suggests collinearity among the independent variables is not
likely to be a concern for the regression results reported in the next section.

TABLE 3
Descriptive statistics for regression variables*


Variable    Mean      (Median)    Minimum    Maximum
Beta        0.7571    (0.6244)    -0.638     3.863
OP          2.10      (2.06)       1.030     2.980
Debt        1.198     (1.038)      0.125     4.947
RPFD        0.125     (0.072)      0.0001    2.039


* Statistics based on pooling all firm/year data for 239 firms with RPFD outstanding during 1979-1989 (1,658
observations). Beta (dependent variable) is estimated from a one factor market model based on weekly returns over the
fiscal year. OP (proxy for operating risk) is estimated from a regression of firm accounting returns (income before
interest and taxes scaled by assets) on a market index of accounting returns. Debt equals total debt divided by the market
value of equity plus perpetual preferred stock at fiscal year end. RPFD is redeemable preferred stock divided by the
market value of equity plus perpetual preferred stock at fiscal year end (natural logs of OP, Debt, and RPFD are used
to mitigate the potential effects of extreme values on regression estimates). Pearson product moment correlations
(Spearman correlations in (.) for regression variables - all significant at p<.01):


              Beta       OP         OP*Debt
OP            .361
             (.401)
OP*Debt       .232       .312
             (.108)     (.207)
OP*RPFD       .126       .303       .463
            (-.072)     (.092)     (.374)

To the extent that the operating risk proxy (OP) does not completely control for this source of 
variation, the results for model 3 may be due to industry effects rather than the effects of RPFD 
on systematic risk. The data in table 2 indicate that utility firms (SIC—4.9) comprise nearly one- 
half of the sample. Due to their regulated status, we also expect these firms to exhibit lower 
systematic risk. The year dummy variables are incorporated to control for potential inter-temporal 
variation in the systematic and operating risk relation.” 

The results for model 4 in table 4 are consistent with the earlier reported results on the market 
perception of RPFD. Both the estimated coefficients for the OP and leverage variables remain 
positive and significant as predicted. Furthermore, the coefficient on the RPFD variable is 
negative and insignificant after controlling for industry and year effects. These results corroborate 


22 We also estimated a model with SIC control variables, in each year of the sample period. The mean across the yearly
models for each coefficient was then computed. Under the assumption that each year is an independent panel, the t-test
on the mean of the yearly coefficients is adjusted for the effects of time-series correlation (Bernard 1987). These results
are qualitatively similar to the pooled estimations.

TABLE 4
Results of regressions explaining variation in systematic risk conditional on RPFD*


General Model:
$$\mathrm{Beta}_{it} = a_0 + \sum \mathrm{SIC}_{it} + \sum \mathrm{YEAR}_{it} + a_1\,OP_{it} + a_2\,(OP_{it} \times Debt_{it}) + a_3\,(OP_{it} \times RPFD_{it}) + e_{it}$$


                                                                               Adjusted
Model                                             $a_1$     $a_2$     $a_3$    R² (%)
1 Operating Risk Only                             0.98                          13.0
                                                 (.001)
2 Operating Risk and Debt                         0.87      0.05                14.5
                                                 (.001)    (.001)
3 Operating Risk, Debt and RPFD                   0.89      0.06     -0.06      14.7
                                                 (.001)    (.001)    (.042)
4 Industry and Year Controls, Operating Risk,     0.31      0.03     -0.03      40.0
  Debt, and RPFD                                 (.001)    (.002)    (.217)


* Coefficient estimates from regressions of systematic risk on operating risk and leverage measures. All models are based
on 1,658 observations. Two tail p values on the significance of the t-statistics are in (.). Beta (proxy for systematic risk)
is estimated from a one factor market model based on weekly returns over the fiscal year. OP (proxy for operating risk)
is estimated from a regression of firm accounting returns (income before interest and taxes scaled by assets) on a market
index of accounting returns. Models 2 through 4 include additional explanatory variables that are interacted with OP.
Debt equals total debt divided by the market value of equity plus perpetual preferred stock at fiscal year end. RPFD is
redeemable preferred stock divided by the market value of equity plus perpetual preferred stock at fiscal year end
(natural logs of OP, Debt, and RPFD are used to mitigate the potential effects of extreme values on regression estimates).
Model 4 is based on model 3 but with dummy variables for industry membership (1-digit SIC codes; see table 2) and
YEAR dummies for each year in the estimation period (1980-1989).


the conclusion that RPFD does not have the same impact on the market’s assessment of risk as 
debt.²³ ²⁴


Variation in the RPFD / Systematic Risk Relation 
RPFD Attributes 


The results reported in table 4 suggest that, on average, the market does not perceive RPFD
as impacting risk in a fashion similar to debt or equity. These results are not surprising since these 


23 The importance of the year and industry controls is reflected in the increased R² between the models 3 and 4 in table
4. In addition, the intercept estimate (not reported) goes from being strongly significant in models 1 through 3 to being
insignificant upon inclusion of the year and industry controls. Thus, in subsequent tests, we focus on the expanded
specification.

24 To the extent that a firm has risky debt and the level of this debt is correlated with RPFD issuance, the less positive
relation documented for RPFD in table 4 may be due to this omitted factor. Thus, the models also were estimated after
including alternative variables to control for this potential misspecification. We interacted the debt variable with a
dummy variable for the extreme levels of debt (within industry), under the assumption the highly levered firms are more
likely to have risky debt. We also specified the dummy variable based on operating performance (extreme low return
on assets within industry) to proxy for the effects of risky debt on the systematic risk relation. Finally, we dropped all
observations when the book value of equity was negative. The results reported in table 4 were robust to these various
controls for risky debt.

securities have both debt and equity features (mandatory redemption payments and the ability to
avoid payments in some contexts, respectively). In addition, RPFD frequently are structured with
other features that might affect their economic substance. For example, the table 2 data reveal that
24 percent of RPFD in this sample have voting rights similar to common stock and 16 percent have
a conversion feature. To the extent these features affect the market perception of these securities,
then we expect a different relationship between Beta and RPFD for issues with equity-like 
characteristics. 

The results in panel A of table 5 provide support for this conclusion. The models in panel A 
interact a dummy variable (Attribute) with the RPFD variable. The Attribute variable takes a 
value of one if an RPFD issue has voting rights similar to common (model A2), is convertible 
(model A3), is neither a voting issue nor is convertible (model A4). The results for models A2 and 
A3 in table 5 indicate that RPFD with these equity features exhibit a relation with Beta that is 
similar to what would be expected of equity rather than of debt. The coefficients on the interaction
variables for voting and conversion features (VOTE and CONV - ($a_4$)) are negative and
significant in these models suggesting RPFD with these features have a more equity-like impact on risk
than other RPFD. The results for model A4 in table 5 further support the importance of RPFD
attributes for the market perception of these securities. In this model, the attribute coefficient ($a_4$)
reflects variation in the $a_3$ coefficient conditional on no equity features (non-voting and not
convertible). To the extent that these RPFD are most “debt-like,” a positive coefficient estimate
is expected for $a_4$.

The results for model A4 in table 5 confirm this expectation. The coefficient on RPFD ($a_3$)
is negative and significant, consistent with the earlier reported results that, on average, the impact
on systematic risk is most similar to that of equity, after controlling for variation in security
attributes. However, the $a_4$ coefficient is positive and significant suggesting that RPFD without
equity features exhibit an association with systematic risk more similar to debt compared to those
securities with equity features. While these RPFD appear more similar to debt than other RPFD,
the sum of the $a_3$ and $a_4$ coefficients is indistinguishable from zero (.19 + (-.17) = .02, p>.20). These
results are inconsistent with even these debt-like RPFD having the same impact on systematic risk 
as debt. 
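
A test that the sum of two regression coefficients differs from zero can be sketched as below. The example uses synthetic data in which the RPFD slope and the incremental slope for issues without equity features are set to -0.17 and 0.19 purely for illustration; the function names, sample size, and noise level are assumptions, and the estimation used in the reported tests may differ. Converting the t-statistic to a p-value would require a t distribution and is omitted.

```python
import numpy as np


def ols_with_cov(X, y):
    """OLS coefficients and their estimated covariance matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    dof = X.shape[0] - X.shape[1]
    sigma2 = (resid @ resid) / dof
    return coefs, sigma2 * np.linalg.inv(X.T @ X)


def linear_combo_t(coefs, cov, weights):
    """Estimate and t-statistic for a weighted sum of coefficients."""
    w = np.asarray(weights, dtype=float)
    estimate = float(w @ coefs)
    std_err = float(np.sqrt(w @ cov @ w))
    return estimate, estimate / std_err


# Synthetic data: slope on OP*RPFD plus an increment for "no equity feature" issues.
rng = np.random.default_rng(1)
n = 2000
op_rpfd = rng.uniform(0.0, 2.0, n)
no_equity = rng.integers(0, 2, n).astype(float)  # 1 = non-voting, not convertible
y = 0.5 - 0.17 * op_rpfd + 0.19 * op_rpfd * no_equity + rng.normal(0, 0.1, n)

X = np.column_stack([np.ones(n), op_rpfd, op_rpfd * no_equity])
coefs, cov = ols_with_cov(X, y)

# Sum of the RPFD slope and the attribute increment (weights 0, 1, 1).
sum_estimate, sum_t = linear_combo_t(coefs, cov, [0.0, 1.0, 1.0])
```

The weights vector selects which coefficients enter the sum, so the same helper covers a single coefficient (one nonzero weight) or any other linear combination.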


Utility / Non-Utility Analysis 

To provide further insight into the features of RPFD that may affect the market's perception 
of these securities, the table 5 analysis was also conducted on utility (SIC=4.9) and non-utility 
firms. Examination of the utility/non-utility partition is motivated across a number of dimensions. 
First, the utilities represent a large industry subset of the overall sample (nearly one-half of the 
sample firms). In addition, only two of the utility issues have a conversion feature and the 
incidence of voting rights is lower for the utility group than virtually all the other groups. Finally, 
utility firms, on average, have less RPFD in their capital structures (see table 2). 

In panel B of table 5, a utility dummy variable is interacted with the capital structure and 
operating risk variables to assess variation in the earlier results conditional on utility industry 
membership. Consistent with the earlier results, the OP and leverage coefficients have the 
predicted signs and are significant. While there is some evidence of variation in the effects of debt 
and operating risk on Beta, conditional on utility industry membership (a, and a, >0, p<.05, .11 
respectively), we focus on the results for RPFD. There is evidence of a marked difference between 
the market perception of RPFD issued by utility and non-utility firms. 

The results for model B1 for the non-utility firms are consistent with those reported in table 
4 and panel A of table 5. The coefficient estimate for RPFD (a, = (-.06), p<.04) is consistent with
the relation with perceived systematic risk, on average, being similar to that of equity for the
non-utility firms. In contrast, results for the utility sub-sample provide little evidence that RPFD has
a relation with risk similar to that of either debt or equity for utility firms. The incremental effect
of RPFD for utility firms in model B1 is positive but insignificant (a, + a, = .07, p>.20). The results
for model B2 further corroborate the relevance of equity attributes for the market perception of
RPFD in the case of the non-utility sub-sample, but not for the utility group. The results for model
B2 in panel B of table 5 for the non-utility group are consistent with the overall results reported
for model A4 in panel A of table 5. On average, RPFD exhibit an inverse relation with systematic
risk (a, = (-.19), p<.001) while RPFD with no equity features are not perceived as debt or equity
(a, + a, = .004, p>.20). For the utility sub-sample, there is little evidence that RPFD has an effect
on risk similar to debt, either in an average sense or when focusing on RPFD that are most
debt-like. While the average effect is positive, the sum of the coefficients is not significant (a, + a, = .11,
p>.20). Furthermore, controlling for security attributes does not significantly improve the
specification of the RPFD / Beta relation for the utility firms (a, + a, = -.05, p>.20).

In summary, the findings reported in table 5 indicate that the market perception of RPFD is 
conditioned on attributes that predictably affect the economic substance of these securities and 
varies according to industry membership. The earlier evidence that RPFD with certain equity 
attributes affect market assessed systematic risk in a manner similar to equity is limited to the sub- 
sample of non-utility firms; there is little indication that utility investors treat these RPFD similar 
to equity when assessing risk. This finding is not unexpected since utility issues have lower 
incidence of equity features.²⁵


VI. DISCUSSION AND CONCLUSION 


The combination of a mandatory redemption feature and the ability to avoid dividend and 
redemption payments in times of distress make RPFD a hybrid security. This study provides 
empirical evidence on the market perception of the economic substance of these securities by 
investigating the relation between RPFD and systematic risk. The analysis reveals that, in general 
(and despite mandatory redemption features), RPFD does not exhibit a debt-like relation with 
systematic risk. Further examination reveals, however, that the relation differs depending upon 
RPFD attributes. Specifically, RPFD that are convertible to common stock or have voting rights 
impact the market assessment of systematic risk in a manner similar to equity. These results 
suggest that the ability to omit redemption payments during periods of financial distress 
combined with control and conversion features results in some RPFD that exhibit a relation with 
systematic risk similar to that of equity, while the relation for RPFD without these characteristics 
does not resemble that of either debt or equity. 

These results have implications for assessing the merits of classification approaches being 
considered by the FASB for RPFD and other hybrid securities. To the extent that 
securities reported within a classification have similar economic substance, such classifications 
will be useful in accurately communicating information about relative claims to firm resources. 
One indicator of a security's economic substance is its relation with market assessed systematic 
risk. Our findings suggest that the relation of RPFD with risk is not consistent with these securities 
being perceived as similar to debt or equity, thus reducing the potential usefulness of the 
dichotomous classification model. More generally, our evidence questions the merit of classification 
within financial statements as a means of conveying information about hybrid securities. 
Given variation in RPFD attributes (which result in diverse securities), it will be difficult for the 
FASB to develop a comprehensive classification rule for hybrid securities (including the use of 
quasi-equity) that reflects the economic substance of these securities. 


15 Additional analyses were conducted to examine the robustness of the reported results to variations in model 
specification, effects of variable measurement, and potential violations of regression assumptions. For example, the 
results are robust to various truncation rules on the independent variables and to controls for heteroskedasticity. In 
addition, the analyses were repeated, with no effect on the main inferences, after deleting observations identified as 
“influential” using the regression diagnostics of Belsley et al. (1980). Again, the earlier reported inferences concerning 
the market perception of RPFD are unchanged by use of these alternative estimation techniques. 

The fundamental financial instruments approach may adequately address the reporting 
problems associated with some compound securities, such as convertible bonds; that is, the option 
created by the conversion feature and the bond are fundamental elements, which can be classified 
as either debt or equity. However, other compound securities, like convertible RPFD, have a non- 
divisible hybrid security fundamental element, for which the economic substance differs from 
that of straight debt or equity. In addition, incorporation of features such as voting rights also 
affects the economic substance of these securities and questions the usefulness of even the quasi- 
equity approach for reporting hybrid securities. In conclusion, while the results of this study 
cannot be used to definitively determine whether the FASB should adopt a particular (e.g., 
dichotomous, trichotomous) classification approach, the results do suggest that classifications are 
not likely to communicate important information relevant to hybrid security valuation. Importantly, 
information on these RPFD attributes generally is not disclosed in financial statements.16 
Thus, our results suggest that the classification model adopted for hybrid securities should also 
contain provisions for disclosure of important security attributes.17 


16 Most of the security attribute data for this paper were obtained from Moody's manuals because they were not available 
in financial statements. 

17 The entity model (Paton 1922) is a disclosure-oriented approach to security classification that is being considered by 
the FASB (FASB 1990). This approach abandons classification and instead lists all sources of capital in order of 
seniority (combined with disclosure of key security features). Its general implications for hybrid securities are discussed 
in Clark (1993). For specific discussion with respect to RPFD, see Kimmel and Warfield (1993). 


North Carolina 


Vice Chairperson: Karen McCarron, Georgia 
State University 

Secretary-Treasurer: Robert A. Nehmer, Georgia 
State University 


i Elect: Penny Wardlow, Governmen- 
tal Accounting Standards Board 
Secretary: Florence Sharp, Ohio University 


Secretary: Edward R. Shoenthal, CUNY 
Brooklyn College 

Treasurer: O. Finley Graves, University of 
Mississippi 


Secretary: Martha Eining, University of Utah 
Treasurer: Steve Sutton, Arizona State 
University-West Campus 


Past President: Michael A. Robinson, Baylor 
University 

Secretary-Treasurer: Hadley Schaefer, University 
of Florida 


Chairperson Elect: Jessie Dillard, University of 
New Mexico 
-Treasurer: Anthony G. Puxty, 
University of Strathclyde 


Teaching and Curriculum Section 


Chairperson: Richard E. Baker, Northern Illinois Secretary: Sharon L. Kimmell, University of 
University n 

Vice Chairperson (Academic): Kent St. Pierre, Treasurer: David E. Stout, Villanova University 
University of Delaware 

Vice Chairperson (Practice): George Krull, Grant 
Thornton 


Two-Year College Section 


Chairperson: Ellen Sweatt, De Kalb College (North) 
Vice Chairperson: Robert C. Maloney, University of Alaska-Anchorage 
Secretary/Editor: Leonard Long, Fisher College 
Coordinator of Regional Representatives/Officer at Large: Linda Lessing, SUNY Farmingdale 


ACCOUNTING ACCREDITATION COMMITTEE 


Charge: To monitor issues and developments in accreditation that may have an impact on accounting 
programs—specifically: 
1. To follow the accreditation activities of the AACSB as well as other developments in accreditation. 
2. To inform the AAA Executive Committee of accreditation issues as they arise and recommend 
policies or positions as appropriate. 


Chairperson: John T. Ahern, DePaul University 
Raymond C. Dockweiler, University of Missouri-Columbia 
Jack C. Gray, Michigan State University 
James A. Heintz, University of Connecticut 
Thomas P. Howard, University of Alabama 
John Ribezzo, Community College of Rhode Island 
Richard E. Flaherty, Arizona State University 


ACCOUNTING EDUCATION ADVISORY COMMITTEE 
(Standing Committee of the Association) 


Charge: To serve as the co-ordinating committee for Association activities that involve accounting 
education—specifically: 

1. To develop and issue position statements on matters relating to accounting education. 

2. To develop and recommend to the Executive Committee educational policy for the Association. 

3. To initiate, co-ordinate, and administer the activities of the Association in the field of accounting 
education, including accounting education research and continuing professional education. 

4. To provide effective accounting education liaison with the Accounting Education Change Commis- 
sion, regions, and sections. 

5. To co-ordinate the work of committees for the Doctoral Consortium, New Faculty Consortium, 
Professional Examinations, Trueblood Seminars, and other education-designated committees. 


Chairperson: Jan R. Williams, University of Tennessee 
Robert Glen Berryman, University of Minnesota 
Richard E. Flaherty, Arizona State University 
Karen L. Hooks, University of South Florida 
Hugh A. Hoyt, Miami University 
Albert R. Mitchell, James Madison University 
Lynn M. Paluska, Nassau Community College 
Jamie Pratt, Indiana University 
Kevin D. Stocks, Brigham Young University 
Shyam Sunder, Carnegie Mellon University 
Michael A. Diamond, University of Southern California 


BY-LAWS COMMITTEE 
(Standing Committee of the Association) 


Charge: To review the Association’s By-Laws and recommend revisions, as appropriate, to the 
Executive Committee. 


Chairperson: Charles G. Carpenter, Miami University 
Jay M. Smith, Jr., Brigham Young University 
Mary S. Stone, University of Alabama 


CARTER SCHOLARSHIPS COMMITTEE 


Charge: To recommend recipients for the Arthur H. Carter Foundation Student Scholarships— 
specifically: 
1. To collect and evaluate applications for the Carter Foundation Scholarships. 
2. To rank the applications received. 
3. To transmit the rankings to the trustees of the Foundation. 
4. To develop a plan for continuation of funding, in concert with the Executive Committee, once the 
endowment is extinguished. 


Chairperson: John Cumming, Miami University 
Karen L. Hooks, University of South Florida 
Rajendra P. Srivastava, University of Kansas 


COMPETITIVE MANUSCRIPT AWARD COMMITTEE 


Charge: To select one, two, or (at most) three recipient(s) of the Competitive Manuscript Award, using 
criteria approved by the Executive Committee. 


Chairperson: Abbie J. Smith, University of Chicago 
A. Rashad Abdel-khalik, University of Florida 
Stanley Baiman, University of Pennsylvania 
Sarah E. Bonner, University of Southern California 
John R. M. Hand, University of North Carolina 
David A. Ziebart, University of Illinois 
Mark A. Wolfson, Stanford University 


CORPORATE ACCOUNTING POLICY SEMINARS COMMITTEE 


Charge: To plan and hold the Corporate Accounting Policy Seminar in accordance with established AAA 
policies—specifically: 
1. To plan for and arrange to have the 1995 seminar conducted and administered. 
2. To report on the results and effectiveness of the seminar and to recommend any changes to the 
Executive Committee by November 1, 1995. 


Chairperson: Alison Hubbard Ashton, Duke University 
Philip D. Ameen, General Electric Company 
Lon Amett, Bethlehem Steel 
William J. Ihlanfeldt, Shell Oil Company 
Joseph Martin, IBM 
Dan Sansone, Vulcan Materials, Inc. 
John K. Simmons, University of Florida 
Robert N. Freeman, University of Texas at Austin 


DELOITTE & TOUCHE WILDMAN MEDAL AWARD COMMITTEE 


Charge: To administer the John R. Wildman Medal award program in accordance with the terms of the 
grant from the Deloitte & Touche Foundation. 


Chairperson: David F. Larcker, University of Pennsylvania 
Gary Biddle, University of Washington 
Mark Chain, Deloitte & Touche 
Ronald Dye, Northwestern University 
Paul A. Griffin, University of California-Davis 
William F. Messier, Jr., University of Florida 


DOCTORAL CONSORTIUM COMMITTEE 


Charge: To plan and hold the 1995 Doctoral Consortium in accordance with established AAA policies— 
specifically: 
1. To plan for and arrange to have the 1995 consortium conducted and administered. 
2. To report on the results and effectiveness of the consortium and to recommend any changes to the 
Executive Committee by November 1, 1995. 


Chairperson: Julie H. Collins, University of North Carolina 
Daniel W. Collins, University of Iowa 
Joel S. Demski, University of Florida 
Jennifer Francis, University of Chicago 
Ronald R. King, Washington University 
Grace Pownall, Emory University 


DOCTORAL FELLOWSHIPS COMMITTEE 


Charge: To select the 1994—1995 recipients of the American Accounting Association doctoral fellow- 
ship grants using the criteria established by the Executive Committee— specifically: 
1. To select the grant recipients. 
2. To seek financial support, in concert with the Executive Committee, for the doctoral fellowship 
program. 


Chairperson: Paul Danos, University of Michigan 
David C. Burghstahler, University of Washington 
Mark M. Chain, Deloitte & Touche 
James Don Edwards, University of Georgia 
Gerald L. Salamon, Indiana University 
Michael A. Diamond, University of Southern California 


EDUCATIONAL MATERIALS COMMITTEE 
(New Committee for 1994—1995) 


Charge: To identify and assess financially feasible options for the production and/or distribution of 
nonperiodical educational materials under Association auspices and to report back to the Executive 
Committee no later than August 1, 1995. 


Chairperson: Gary John Previts, Case Western Reserve University 
Elba F. (Bud) Baskin, Arthur Andersen & Co. 
John W. Dickhaut, University of Minnesota-Twin Cities 
Yuji Ijiri, Carnegie Mellon University 
W. Morley Lemon, University of Waterloo 
Thomas Robinson, University of Miami 
Clinton E. White, Jr., University of Delaware 


FINANCE COMMITTEE 
(Standing Committee of the Association) 


Charge: To develop financial goals and strategies for the Association— specifically: 
1. To evaluate additional or alternative ways to finance existing Association activities. 
2. To plan financial strategies for Association initiatives and activities. 
3. To work with the 1994—1995 President-Elect in preparing the 1995-1996 budget. 
4. To recommend how AAA funds should be invested in accordance with established Executive 
Committee policies and procedures. 


Chairperson: Mary S. Stone, University of Alabama 
W. Steve Albrecht, Brigham Young University 
Eugene E. Comiskey, Georgia Institute of Technology 
Linda Mills Marquis, Northern Kentucky University 
Bernard J. Milano, KPMG Peat Marwick 
Paul L. Gerhardt, Executive Director, American Accounting Association 
Katherine Schipper, AAA President-Elect, University of Chicago 


FINANCIAL ACCOUNTING STANDARDS COMMITTEE 


Charge: To co-ordinate all Association activities with respect to financial accounting standard-setting— 
specifically: 

1. To evaluate selected discussion memoranda and exposure drafts related to financial accounting and 
reporting in the private sector as they are released by the FASB and to respond to the FASB in writing and 
by appearing at selected public hearings. 

2. To be cognizant of emerging issues related to financial accounting and reporting in the private sector 
through materials issued by the Emerging Issues Task Force and other relevant sources with evaluation, 
where appropriate, of the significance and potential implications of such issues. 

3. To meet with the FASB, normally on an annual basis. 

4. To explore the publication of selected responses by the committee in Horizons or other appropriate 
outlets. 

5. To consider and promote ways to increase the academic input to the standard-setting process. 

6. To provide input to the Financial Reporting Issues Conference Committee on the appropriate content 
and structure of the 1995 conference. 


Chairperson: Mary E. Barth, Harvard University 
John Gribble, Coopers & Lybrand 
Daniel W. Collins, University of Iowa 
John A. Elliot, Cornell University 
Wayne R. Landsman, University of North Carolina 
Stephen Penman, University of California-Berkeley 
John Smith, Deloitte & Touche 
Ray G. Stephens, Kent State University 
Terry D. Warfield, University of Wisconsin-Madison 


COMMITTEE ON FINANCIAL COMMUNICATIONS 


Charge: To evaluate the Association’s financial reporting practices (financial statements, notes and other 
related data) and suggest improvements where needed. 


Chairperson: Eugene E. Comiskey, Georgia Institute of Technology 
Rhoda Icerman, Florida State University 
William T. Wrege, Ball State University 
Linda Mills Marquis, Northern Kentucky University 
Mary S. Stone, University of Alabama 


FINANCIAL REPORTING ISSUES CONFERENCE COMMITTEE 


Charge: To plan and hold the 1995 Financial Reporting Issues Conference in accordance with established 
AAA policies—specifically: 
1. To seek the advice of the Financial Accounting Standards Committee and the SEC Liaison 
Committee on the content and structure of the 1995 conference. 
2. To arrange to have the conference conducted and administered. 
3. To report on the results and effectiveness of the conference and to recommend any changes to the 
Executive Committee. 


Chairperson: Krishna G. Palepu, Harvard University 
Victor L. Bernard, University of Michigan 
Robert H. Herz, Coopers & Lybrand 
Robert W. Holthausen, University of Pennsylvania 
Ross Watts, University of Rochester 
Baruch Lev, University of California-Berkeley 


INNOVATION IN ACCOUNTING EDUCATION AWARD COMMITTEE 


Charge: To select the recipient(s) of the Outstanding Teaching/Curriculum Development Award, using 
the criteria approved by the Executive Committee. 


Chairperson: Russell M. Barefield, University of Terry S. Lindenberg, Rock Valley College 

ia Robert W. Rouse, College of Charleston 
Bala G. Dharan, Rice University Bradley J. Schwieger, St. Cloud State University 
Brent C. Inman, Coopers & Lybrand Kevin D. Stocks, Brigham Young University 


INTERNATIONAL ACCOUNTING RESEARCH CONFERENCE COMMITTEE 


Charge: To plan and hold an International Accounting Research Conference in accordance with 
established AAA policies—specifically: 
1. To arrange to have the conference conducted and administered. 
2. To report on the results and effectiveness of the conference and to recommend any changes to the 
Executive Committee within four months after the conference is held. 


Chairperson: Grace Pownall, Emory University 
Frederick D. S. Choi, New York University 
Gary K. Meek, Oklahoma State University 
Mark Lang, University of North Carolina at Chapel Hill 


INTERNATIONAL FACULTY EXCHANGE COMMITTEE 


Charge: To oversee the faculty exchange program and assist Association staff in the administration of 
the program—specifically: 
1. To select the U.S./Canadian professors to be involved in the program for the 1996 tour and to select the 
schools in the United States and Canada for the 1995 lecture tour. 
2. To select the area for the 1997 visit. 
3. To assist the Executive Director with announcements, arrangements, and follow-up evaluations for 
the respective visits. 


Chairperson: Sidney Gray, University of Warwick 
Adolf J. H. Enthoven, University of Texas at Dallas 
Yukio Fujita, Aichi Gakuin University 
V. Bruce Irvine, University of Saskatchewan 
Chen-en Ko, National Taiwan University 
Konrad W. Kubin, Virginia Polytechnic Institute & State University 
Stephen A. Zeff, Rice University 
Paul Gerhardt, Executive Director, American Accounting Association 


LITIGATION ADVISORY COMMITTEE 


Charge: To monitor and provide counsel concerning the Litigation Database project and to explore other 
litigation issues, as appropriate. 


Chairperson: William R. Kinney, Jr., University of Texas at Austin 
Andrew D. Bailey, Jr., University of Arizona 
John W. Hill, Indiana University 
Richard W. Leftwich, University of Chicago 
Norman R. Walker, Price Waterhouse 
Victor L. Bernard, University of Michigan 


MEMBERSHIP SERVICES COMMITTEE 
(New Committee for 1994-1995) 


Charge: To evaluate the benefits of Association membership and to determine the services members 
would like the Association to provide—specifically: 
1. To develop, distribute, and evaluate a membership questionnaire to assess members’ views on the 
Association and how the Association might best serve members’ needs. 
2. To develop follow-up procedures for determining why members leave the Association. 
3. To develop a strategy for addressing Association membership trends. 
4. To exchange information with the Membership and Subscriptions Committee. 


Chairperson: Donald L. Madden, University of 
Richard E. Baker, Northern Illinois University 
Jacqueline B. Sanders, Mercer County Community College 
Joseph J. Schultz, Jr., Arizona State University 
Larry P. Scott, Price Waterhouse 
Gary H. Siegel, DePaul University 
James E. Sorensen, University of Denver 
Paul Gerhardt, Executive Director, American Accounting Association 


MEMBERSHIP AND SUBSCRIPTIONS COMMITTEE 


Charge: To work with the Association staff and other AAA committees to define and enhance the benefits 
of membership—specifically: 
1. To exchange information with the Membership Services Committee and the Relations with Two- 
Year College Faculty Committee. 
2. To work with the headquarters staff in upgrading the Association’s database of accounting 
educators. 


Chairperson: Joseph Kreutle, Miami Dade 
Community College 

Chairperson: Earl Kay Stice, Rice University 

Penne Ainsworth, Kansas State University 

Robert D. Allen, University of Utah 

Sam Asamoah, Jefferson Community College 


M. Edgar Barrett, American Graduate School of 
International Management 

George Battistel, Portland State University 

Joseph V. Carcello, University of Tennessee 

Kevin C. W. Chen, Rutgers University-New 
Brunswick 

Francie Criner, University of Maine-University 
College 

Elizabeth B. Davis, Baylor University 

Leslie G. Eldenburg, University of Arizona 

Neil Fargher, University of Oregon 

Stephen L. Fogg, Temple University 

Allen Ford, University of Kansas 

David R. Fordham, James Madison University 

Hubert D. Glover, Clemson University 

O. Finley Graves, University of Mississippi 

David A. Guenther, University of Connecticut 

Rama R. Guttikonda, Alabama State University 

Steven C. Hall, Washington State University 

Cynthia Halloway, Tarrant Community College 

les Harter, University of Alaska-Fairbanks 

Anita S. Hollander, Florida State University 

Van E. Johnson, Northern Illinois University 

Paul M. Kazenski, University of Hawaii at 
Manoa 

Leon Korte, University of South Dakota 

Lucille E. Lammers, Illinois State University 

Lucy L. Larson, St. John's University 

Amy H. L. Lau, Oklahoma State University 

Max A. Laudeman, Indiana University-Purdue 
University 

Timothy M. Lindquist, University of Northern 
Iowa 


Marlys Gascho Lipe, University of Colorado at 
Boulder 

Joseph Lloyd-Jones, University of Ottawa 

Julie A. Lockhart, Western Washington 
University 

Mary J. Loyland, University of North Dakota 

Joan L. Luft, Michigan State University 

P. Merle Maddocks, University of Alabama in 
Huntsville 

William Markell, University of Delaware 

Ann Martin, University of Colorado at Denver 

Marc Massoud, Claremont McKenna College 

Lorraine McClenny, North Carolina State 
University 

Kate Mooney, St. Cloud State University 

Charles W. Mulford, Georgia Institute of 
Technology 

Robert A. Nehmer, Georgia State University 

Don Pagach, North Carolina State University 

Timothy A. Pearson, West Virginia University 

Russell J. Petersen, University of Akron 

Barbara G. Pierce, Florida Atlantic University 

Norma C. Powell, University of Massachusetts at 
Lowell 

Georgia C. Saemann, University of Wisconsin- 
Milwaukee 

Robert H. S. Sarikas, Boise State University 

James A. Schweikart, University of Richmond 

Keith F. Sellers, University of Arkansas 

Jeffrey F. Shields, University of Baltimore 

Katherine J. Silvester, Rensselaer Polytechnic 
Institute 

Richard H. Simpson, University of Massachusetts 

Paul J. Steinbart, Saint Louis University 

Randal H. Stitts, Sul Ross State University 

John M. Strefeler, University of Nevada, Reno 

Barbara G. Taylor, Montana State University 

Robert Werner, Rutgers University 

Anne Wu, National Chengchi University 


MINORITY FACULTY DEVELOPMENT COMMITTEE 


Charge: To co-ordinate Association activities with respect to minority recruitment and development— 
specifically: 
1. To explore ways in which potential minority candidates can be encouraged to apply to accounting 
programs and pursue professional careers in accounting education. 
2. To explore ways in which to encourage the development of minority faculty members. 
3. To examine and monitor minority recruitment programs of other professional organizations, such as 
the AICPA. 
4. To develop a strategic plan for the American Accounting Association in the area of minority faculty 
development. 


Chairperson: Willie E. Gist, University of 
Oklahoma 
Barney R. Cargile, University of Alabama 


Saundra T. Drumming, Florida A&M University 
James C. Texas A&M University 
Theresa A. Hammond, Boston College 

Jack E. Kiger, University of Tennessee 

T. Sterling Wetzel, Oklahoma State University 


NEW FACULTY CONSORTIUM COMMITTEE 


Charge: To plan and hold the 1995 New Faculty Consortium in accordance with established AAA 
policies—specifically: 
1. To arrange to have the consortium conducted and administered. 
2. To report on the results and effectiveness of the consortium and to recommend any changes to the 
Executive Committee by March 1, 1995. 


Chairperson: Jane F. Mutchler, Pennsylvania 
State University 

Linda S. Bamber, University of Georgia 

Dennis R. Reigle, Arthur Andersen & Co. 


Jamie Pratt, Indiana University 
Judy D. Rayburn, University of Minnesota-Twin 
Cities 


NOMINATIONS COMMITTEE 
(Standing Committee of the Association) 


Charge: To select a list of nominees to AAA offices for election by the membership in August 1995. 


Chairperson: Arthur R. Wyatt, University of 
Illinois 

Andrew D. Bailey, Jr., University of Arizona 

Quiester Craig, North Carolina A&T State 
University 

John A. Elliot, Cornell University 


Zoe-Vonna Palmrose, University of Southern 
California 

Gary L. Sundem, University of Washington 

Lawrence A. Tomassini, The Ohio State 
University 


NOTABLE CONTRIBUTIONS TO ACCOUNTING LITERATURE AWARD 
SCREENING COMMITTEE 


Charge: To identify works of exceptional merit from published accounting books and articles that meet 
the guidelines for the “Notable Contributions" awards, as established by the Executive Committee. 


Chairperson: Arnold M. Wright, Boston College 

Stephen K. Asare, University of Florida 

Robert H. Ashton, Duke University 

Bala V. Balachandran, Northwestern University 

Robert M. Bushman, University of Chicago 

Andrew A. Christie, University of Rochester 

David M. Cottrell, Brigham Young University 

Somnath Das, University of California-Berkeley 

Haim Falk, Rutgers University-Campus at 
Camden 


James R. Frederickson, Indiana University 

James F. Gaertner, University of Texas at San 
Antonio 

D. Eric Hirst, University of Texas at Austin 

Ross G. Jennings, University of Texas at Austin 

Marilyn F. Johnson, University of Michigan 

Steven J. Kachelmeier, University of Texas at 
Austin 

Paul D. Kimmel, University of Wisconsin- 
Milwaukee 


Lisa L. Koonce, University of Texas at Austin 

C. Jevons Lee, Tulane University 

Charles M. C. Lee, University of Michigan 

Maureen F. McNichols, Stanford University 

H. Fred Mittelstaedt, University of Notre Dame 

Dale C. Morse, University of Oregon 

Belinda Mucklow, University of Wisconsin- 
Madison 

Daniel P. Murphy, University of Tennessee 

Mark W. Nelson, Cornell University 

Buck K. W. Pei, Arizona State University 

Glenn M. Pfeiffer, San Diego State University 

Krishnamoorthy Ramesh, Northwestern 
University 

Joshua Ronen, New York University 

Richard C. Sansing, Yale University 

Jeffrey W. Schatzberg, University of Arizona 

Brian Shapiro, University of Arizona 

Wayne H. Shaw, University of Colorado at 
Boulder 

Toshi Shibano, University of California-Berkeley 

Keith A. Shriver, Arizona State University 

Konduru Sivaramakrishnan, Carnegie Mellon 
University 

James D. Stice, Brigham Young University 

Amy P. Sweeney, Harvard University 

Walter Teets, University of Illinois 

Jacob K. Thomas, Columbia University 

Jimmy Yang-Tzong Tsay, National Taiwan 
University 

Samuel S. Tung, University of Otago 

David E. Wallin, Ohio State University 

Thomas R. Weirich, Central Michigan University 

Robert N. West, University of Virginia 

James A. Yardley, Virginia Polytechnic Institute 
& State University 


NOTABLE CONTRIBUTIONS TO ACCOUNTING LITERATURE AWARD SELECTION 
COMMITTEE (Joint with the AICPA) 


Charge: To select the recipient(s) of the award from those books and articles identified by the Screening 
Committee as potentially notable contributions to accounting literature. 


Chairperson: Richard A. Lambert, Stanford 
University 

Robert Libby, Cornell University 

Ella Mae Matsumura, University of Wisconsin- 
Madison 


Terrence J. Shevlin, University of Washington 
Baruch Lev, University of California-Berkeley 


OUTSTANDING ACCOUNTING EDUCATOR AWARD COMMITTEE 


Charge: To select one or (at most) two recipient(s) of the Outstanding Accounting Educator Award, using 
criteria approved by the Executive Committee. 


Chairperson: William W. Holder, University of 
Southern California 

G. Michael Crooch, Arthur Andersen & Co. 

Roland E. Dukes, University of Washington 


Corine T. Norgaard, SUNY at Binghamton 
Sarah A. Reed, Texas A&M University 
Albert R. Mitchell, James Madison University 


PROFESSIONAL EXAMINATIONS COMMITTEE 


Charge: To monitor activities of the various professional accounting examination bodies and evaluate 
the implications of their respective examinations on accounting education. 


Chairperson: O. Ray Whittington, San Diego 
State University 

James D. Blum, American Institute of Certified 
Public Accountants 

Priscilla Payne, Institute of Management 
Accountants 

Thomas Powell, Institute of Internal Auditors 


Gerald Smith, University of Northern Iowa 
Larry M. Walther, University of Texas at 
Arlington 
D. Dewey Ward, Michigan State University 
Robert Glen Berryman, University of Minnesota- 
Twin Cities 


PROFESSIONAL PRACTICE ISSUES COMMITTEE 


Charge: To plan and hold one or more roundtable discussions on current issues or problems in 
professional practice and publish the results—specifically: 
1. To arrange to have the roundtable discussion(s) held and administered. 
2. To report on the results and effectiveness of the discussions and to recommend any changes to the 
Executive Committee by August 1, 1995. 


Chairperson: William L. Felix, Jr., University of 


oye Arce Michigan State University 
R. K. McCabe, California State University- 
Fullerton 
Arthur R. Li de University of Illinois 
W. Brace Jo n, University of Iowa 


PROFESSIONALISM AND ETHICS COMMITTEE 


Charge: To foster ethics education and scholarship in the accounting profession-specifically: 

1. To arrange to have the 1994—95 Professionalism and Ethics Seminar conducted and administered. 

2. To report on the results and effectiveness of the seminar and to recommend changes by July 1, 1995. 

3. To prepare the papers and discussions of the 1994-95 seminar for publication, in accordance with 
a previously approved plan and established AAA policies. 

4. To propose and, if approved, organize concurrent sessions at the national meeting and regional 
meetings to report on the result of the 1994—95 seminar. 

5. To propose and, if approved, organize and conduct CPE sessions at the national meeting and regional 
meetings, as warranted. 

6. To report to the Executive Committee on how the integration of ethics into the accounting curriculum 
can be fostered over time. 

7. To review the need to continue activities as a committee to ensure that an Association vehicle 
continues to function to address ethical issues. 


8. To act as a resource to the Association Staff, Executive Committee and Council, regions, sections, 
and other groups on ethics in the accounting profession. 


Chairperson: Robert G. Ruland, Suffolk Univer- 
sity 

Mary S. Doucet, Bowling Green State University 

Gary L. Fish, Illinois State University 

Sharon L. Green, Duquesne University 


Cindy L. Moeckel, Arizona State University 
Lawrence A. Ponemon, SUNY at Binghamton 
Michael K. Shaub, University of Nebraska 

Paul F. Williams, North Carolina State University 
Hugh A. Hoyt, Miami University 


PROGRAM ADVISORY COMMITTEE 


Charge: To assist the President and the Executive Director in developing the technical program for the 
1995 Annual Meeting in Orlando. 


Chairperson: Wanda A. Wallace, College of William & Mary

Karen L. Hooks, University of South Florida
Daniel E. O'Leary, University of Southern 
California 


Kimberly J. Smith, College of William & Mary 
Shyam Sunder, Carnegie Mellon University 

Earl R. Wilson, University of Missouri-Columbia 
Jean Wyer, Coopers & Lybrand 


PUBLICATIONS COMMITTEE
(Standing Committee of the Association) 


Charge: To monitor and co-ordinate all aspects of the Association's publications program—specifically: 
1. To report periodically to the President and Executive Committee on all matters of concern in the field
of publications.


2. To recommend candidates for editor or editor-elect positions as appropriate. 
3. To recommend AAA publications to the Executive Committee as deemed desirable. 


Chairperson: Daniel L. Jensen, The Ohio State 
University 

Robert K. Eskew, Purdue University 

Susan F. Haka, Michigan State University 

Larry E. Rittenberg, University of Wisconsin- 
Madison 


Robert S. Roussey, University of Southern 
California 

Wilfred C. Uecker, Rice University 

William S. Waller, University of Arizona 

Frederick L. Neumann, University of Illinois 
RELATIONS WITH TWO-YEAR COLLEGE FACULTY COMMITTEE


Charge: To help integrate two-year college educational contributions with four-year and graduate 
colleges-specifically: 
1. To recommend measures to the Executive Committee of the American Accounting Association that
will serve the AAA's two-year college members.
2. To help promote the membership of two-year college faculty in the American Accounting 
Association. 


Chairperson: Robert J. Capettini, San Diego State University
Chairperson: Peter G. Dorff, Kent State University-Stark Campus
Roger A. Gee, San Diego Mesa College
A. Douglas Hillman, Drake University
Paul E. Solomon, San Jose State University
Ellen Sweatt, DeKalb College-North
Lynn Paluska, Nassau Community College


RESEARCH ADVISORY COMMITTEE 


Charge: To serve as the coordinating committee for Association activities that involve accounting 

research—specifically: 

1. To monitor the accounting research activities of other accounting associations (e.g., AICPA, FEI,
IIA, and IMA).

2. To advise the Director of Research in the development and administration of the Association’s 
programs in accounting research. 

3. To suggest ways to encourage research in accounting, especially applied research dealing with all 
accounting functions in enterprise settings. 

4. To provide effective research liaison with regions and sections. 

5. To co-ordinate the work of committees for the Competitive Manuscript Award, Doctoral Consor- 
tium, Financial Reporting Issues Conference, and other research-designated committees. 


Chairperson: Victor L. Bernard, University of Michigan
Julie H. Collins, University of North Carolina
Robert N. Freeman, University of Texas at Austin
W. Bruce Johnson, University of Iowa
Baruch Lev, University of California-Berkeley
James K. Loebbecke, University of Utah
William F. Messier, Jr., University of Florida
Krishna G. Palepu, Harvard University
Judy D. Rayburn, University of Minnesota-Twin Cities
Mark A. Wolfson, Stanford University


SECURITIES AND EXCHANGE COMMISSION LIAISON COMMITTEE 


Charge: To conduct activities as appropriate to assist communication and interaction between the SEC 

and the membership of the Association—specifically: 

1. To prepare and transmit a response to the staff of the Securities and Exchange Commission on
proposals soliciting views from the public concerning auditing and financial accounting and reporting. 

2. To make recommendations and to develop program(s) that will assist teaching of and conducting 
research on topics concerning the SEC and its activities. 

3. To provide input to the Financial Reporting Issues Conference Committee on the appropriate content 
and structure of the 1995 conference. 


Chairperson: Robert J. Sack, University of
Jack L. Krogstad, Creighton University
Editor’s Note: Two copies of books for review should be sent to Dr. James Boatsman,
School of Accountancy, Arizona State University, Tempe, AZ 85287-3606. The policy 
of the Review is to publish only those reviews solicited by the Book Review Editor. 
Unsolicited reviews will not be accepted. 


PETER H. KNUTSON, Financial Reporting in the 1990’s and Beyond, (Charlottesville: 
Association for Investment Management and Research, 1993, pp. viii, 98) 


Financial reporting policy issues constitute an increasingly contentious subject for the accounting 
profession and for society. Recently, the Financial Accounting Standards Board encountered significant 
differences of opinion among various constituents on topics such as mark-to-market accounting and 
employee stock options. The Securities and Exchange Commission continues to consider alternative 
financial reporting requirements to apply to foreign companies that register securities in the United States. 
And Congressional committees have deliberated on several issues that directly pertain to financial reporting
policies. 

If you want to learn more about the views of an important group on financial reporting policy issues, 
I recommend Peter Knutson’s report to you. The report was prepared by the Financial Accounting Policy 
Committee of the Association for Investment Management and Research (AIMR). AIMR’s membership 
includes over 24,000 securities analysts, portfolio managers, strategists, consultants, and other investment 
specialists. Over 15,000 of AIMR's members hold the chartered financial analyst (CFA) designation. 
Because of these characteristics, the views expressed in this report present the perspective of an important 
constituency that must be considered in any debate on financial reporting policies. 

The report provides a number of interesting insights on analysts’ views of financial reporting. Because 
the report is a compilation of positions on many aspects of financial reporting by many analysts, the report 
does not present one consistent, coherent model of financial reporting. Indeed, there are several apparent 
inconsistencies in the positions taken in the report. These inconsistencies, however, should not be viewed 
as limitations in the insights. Rather, they reflect the tradeoffs inherent in developing a financial reporting 
model or differences of opinion among AIMR members. This characteristic of presenting AIMR member 
views is perhaps the report's most important feature. 

There are a number of interesting issues that may be appropriately detailed in this review. I identified 
two that I think are representative: 

* The report addresses valuation of assets and liabilities in several sections. The report offers several 
views on historical cost and market valuation. For example, the report states “[b]ecause assets and 
liabilities are both the result of past transactions and events, so is the accounting measure of net 
worth. Financial analysis, on the other hand, assesses, estimates, and gauges value solely in terms 
of expectations of the future" (p.17). Later in the report, however, historic costs are described as 
"sunk costs and there is little disagreement that they are often irrelevant to financial decisions" 
(p.33). In discussing mark-to-market accounting, the report states: “AIMR members have 
different views on market values. Virtually all favor disclosure of market values, at least for 
financial instruments. No one seems to believe that disclosure alone would be detrimental to 
analysts' interests, and all but a few believe that disclosure would be beneficial. Most are opposed 
to replacing historic cost with market values, but a significant minority favors such a move. Most 
oppose extending mark-to-market accounting from financial assets to real assets, although a small 
number does not" (p.38). 
* The report differentiates the roles of financial reporting and financial analysis. In one section, the
report states that “[a]lthough both result in expressions of worth or value, their perspectives are 
diametrically opposed" (p.17). The report describes the perspective of financial reporting as 
presenting the economic history of an entity, while financial analysis is forward-looking. Another 
section of the report states that “when it comes to the valuation of business enterprises—either
singly, in groups, or by components—we rightfully regard that as the province of financial analysis
and a matter beyond the scope of financial reporting" (p.48). 


The report presents views on a number of controversial issues, including mark-to-market accounting, 
accounting for intangible assets, disaggregated financial statements, interim financial statements, compre- 
hensive income, the statement of cash flows, and the standard setting process. Indeed, the report discusses 
many of the important issues facing accounting standard-setters today. 

The report is thought-provoking and informative. It states positions of an important constituency on 
many controversial financial reporting policy issues. I recommend it to anyone who wants to be well 
informed on financial accounting policy issues. 

J. RICHARD DIETRICH 


Deloitte & Touche Professor of Accountancy 
University of Illinois 
EDITORIAL POLICY AND STYLE INFORMATION


EDITORIAL POLICY 


According to the policies set by the Publications Committee (which were endorsed by the Executive 
Committee and were published in the Accounting Education News, June 1987), The Accounting Review 
“should be viewed as the premier journal for publishing articles reporting the results of accounting research 
and explaining and illustrating related research methodology. The scope of acceptable articles should 
embrace any research methodology and any accounting-related subject, as long as the articles meet the 
standards established for publication in the journal....No special sections should be necessary. The primary, 
but not exclusive, audience should be—as it is now—academicians, graduate students, and others interested
in accounting research.” 

The primary criterion for publication in The Accounting Review is the significance of the contribution 
an article makes to the literature. 

The efficiency and effectiveness of the editorial review process is critically dependent upon the actions 
of both authors submitting papers and the reviewers. Authors accept the responsibility of preparing research 
papers at a level suitable for evaluation by independent reviewers. Such preparation, however, should
include subjecting the manuscript to critique by colleagues and others and revising it accordingly prior to 
submission. The review process is not to be used as a means of obtaining feedback at early stages of 
developing the research. 

Reviewers and associate editors are responsible for providing critically constructive and prompt 
evaluations of submitted research papers based on the significance of their contribution and on the rigor of 
analysis and presentation. Associate editors also make editorial recommendations to the editor. 


MANUSCRIPT PREPARATION AND STYLE 

The Accounting Review's manuscript preparation guidelines follow (with a slight modification) the B- 
format of the Chicago Manual of Style (13th ed.; University of Chicago Press). Another helpful guide to 
usage and style is The Elements of Style, by William Strunk, Jr., and E. B. White (Macmillan). Spelling
follows Webster’s International Dictionary. 


FORMAT 


1. All manuscripts should be typed on one side of 8½ x 11 good quality paper and be double spaced,
except for indented quotations. 

2. Manuscripts should be as concise as the subject and research method permit, generally not to exceed 
7,000 words. 

3. Margins of at least one inch from top, bottom, and sides should facilitate editing and duplication. 

4. To assure anonymous review, authors should not identify themselves directly or indirectly in their 
papers. Single authors should not use the editorial "we." 

5. A cover page should show the title of the paper, the author's name, title, and affiliation, any 
acknowledgments, and a footnote indicating whether the author would be willing to share the data (see 
last paragraph in this statement). 


Pagination: All pages, including tables, appendices, and references, should be serially numbered. The first 
section of the paper should be untitled and unnumbered. Major sections may be numbered in roman 
numerals. Subsections should not be numbered. 


Numbers: Spell out numbers from one to ten, except when used in tables and lists, and when used with 
mathematical, statistical, scientific, or technical units and quantities, such as distances, weights, and 
measures. For example: three days; 3 kilometers; 30 years. All other numbers are expressed numerically. 
Generally when using approximate terms spell out the number, for example, approximately thirty years. 


Percentages and Decimal Fractions: In nontechnical copy use the word percent in the text; in technical 
copy the symbol % is used. (See the Chicago Manual for discussion of these usages.) 


Hyphens: Use a hyphen to join unit modifiers or to clarify usage. For example: a well-presented analysis; 
re-form. See Webster’s for correct usage. 


Key Words: The abstract is to be followed by four key words that will assist in indexing the paper. 
ABSTRACT / INTRODUCTION


An Abstract of about 100 words should be presented on a separate page immediately preceding the text. 
The Abstract should concisely inform the reader of the manuscript’s topic, its methods and its findings. 
Keywords and the Data Availability statements should follow the Abstract. The text of the paper should start 
with a section labeled “I. Introduction," which provides more details about the paper's purpose, motivation,
methodology and findings. Both the abstract and the introduction should be relatively non-technical, yet
clear enough for an informed reader to understand the manuscript’s contribution. The manuscript's title, but
neither the author's name nor other identification designations, should appear on the Abstract page. 


TABLES AND FIGURES 


The author should note the following general requirements: 

1. Each table and figure (graphic) should appear on a separate page and should be placed at the end of the 
text. Each should bear an Arabic number and a complete title indicating the exact contents of the table 
or figure. 

2. A reference to each graphic should be made in the text.

3. The author should indicate by marginal notation where each graphic should be inserted in the text.

4. Graphics should be reasonably interpreted without reference to the text.

5. Source lines and notes should be included as necessary.


Equations: Equations should be numbered in parentheses flush with the right-hand margin.


DOCUMENTATION 


Citations: Work cited should use the “author-date system" keyed to a list of works in the reference list 
(see below). Authors should make an effort to include the relevant page numbers in the cited works. 


1. In the text, works are cited as follows: authors’ last name and date, without comma, in parentheses: for 
example, (Jones 1987); with two authors: (Jones and Freeman 1973); with more than two: (Jones et al. 
1985); with more than one source cited together (Jones 1987; Freeman 1986); with two or more works 
by one author: (Jones 1985, 1987). 

2. Unless confusion would result, do not use “p.” or “pp.” before page numbers: for example, (Jones 1987, 
115). 

3. When the reference list contains more than one work of an author published in the same year, the suffix 
a, b, etc. follows the date in the text citation: for example, (Jones 1987a) or (Jones 1987a; Freeman
1985b).

4. If an author’s name is mentioned in the text, it need not be repeated in the citation; for example, “Jones
(1987, 115) says...." 

5. Citations to institutional works should use acronyms or short titles where practicable; for example, 
(AAA ASOBAT 1966); (AICPA Cohen Commission Report 1977). Where brief, the full title of an 
institutional work might be shown in a citation: for example, (ICAEW The Corporate Report 1975). 

6. If the manuscript refers to statutes, legal treatises, or court cases, citations acceptable in law reviews
should be used.


Reference List: Every manuscript must include a list of references containing only those works cited. Each 
entry should contain all data necessary for unambiguous identification. With the author-date system, use the 
following format recommended by the Chicago Manual: 


1. Arrange citations in alphabetical order according to surname of the first author or the name of the 
institution responsible for the citation. 

2. Use author’s initials instead of proper names.

3. Dates of publication should be placed immediately after author’s name.

4. Titles of journals should not be abbreviated.

5. Multiple works by the same author(s) should be listed in chronological order of publication. Two or more
works by the same author(s) in the same year are distinguished by letters after the date.

6. Inclusive page numbers are treated as recommended in Chicago Manual section 8.67.


Sample entries are as follows: 


American Accounting Association, Committee on Concepts and Standards for External Financial Reports. 
1977. Statement on Accounting Theory and Theory Acceptance. Sarasota, FL: AAA. 
Cambridge, United Kingdom: Cambridge University Press.
dissertation, University of Texas, Austin.

Sherman, T. M., ed. 1984. Conceptual Framework for Financial Accounting. Cambridge, MA: Harvard 
Business School. 


Footnotes: Footnotes are not used for documentation. Textual footnotes should be used only for extensions 
and useful excursions of information that if included in the body of the text might disrupt its continuity. 
Footnotes should be double spaced and numbered consecutively throughout the manuscript with superscript 
Arabic numerals. Footnotes are placed at the end of the text. 


SUBMISSION OF MANUSCRIPTS 
Authors should note the following guidelines for submitting manuscripts: 


1. Manuscripts currently under consideration by another journal or publisher should not be submitted. The 
author must state that the work is not submitted or published elsewhere. 

2. In the case of manuscripts reporting on field surveys or experiments, four copies of the instrument 
(questionnaire, case, interview plan, or the like) should be submitted. 

3. Four copies should be submitted together with a check in U.S. funds for $50.00 for members or $100.00
for nonmembers of the AAA made payable to the American Accounting Association. Effective January 
1990, the submission fee is nonrefundable. 

4. The author should retain a copy of the paper. 

5. Revisions must be submitted within 12 months from request, otherwise they will be considered new 
submissions. 


COMMENTS 


Comments on articles previously published in The Accounting Review will be reviewed (anonymously) 
by two reviewers in sequence. The first reviewer will be the author of the original article being subjected 
to critique. If substance permits, a suitably revised comment will be sent to a second reviewer to determine 
its publishability in The Accounting Review. If a comment is accepted for publication, the original author 
will be invited to reply. All other editorial requirements, as enumerated above, also apply to proposed 
comments. 
POLICY ON REPRODUCTION


An objective of The Accounting Review is to promote the wide dissemination of the results of systematic
scholarly inquiries into the broad field of accounting. 

Permission is hereby granted to reproduce any of the contents of the Review for use in courses of 
instruction, as long as the source and American Accounting Association copyright are indicated in any such 
reproductions. 

Written application must be made to the Editor for permission to reproduce any of the contents of the 
Review for use in other than courses of instruction—e.g., inclusion in books of readings or in any other 
publications intended for general distribution. In consideration for the grant of permission by the Review 
in such instances, the applicant must notify the author(s) in writing of the intended use to be made of each 
reproduction. Normally, the Review will not assess a charge for the waiver of copyright. 

Except where otherwise noted in articles, the copyright interest has been transferred to the American 
Accounting Association. Where the author(s) has (have) not transferred the copyright to the Association, 
applicants must seek permission to reproduce (for all purposes) directly from the author(s). 


POLICY ON DATA AVAILABILITY 


The following policy has been adopted by the Executive Committee in its April 1989 meeting.

"An objective of (The Accounting Review, Accounting Horizons, Issues in Accounting Education) is
to provide the widest possible dissemination of knowledge based on systematic scholarly inquiries into
accounting as a field of professional, research, and educational activity. As part of this process, authors are
encouraged to make their data available for use by others in extending or replicating results reported in their 
articles. Authors of articles which report data dependent results should footnote the status of data 
availability and, when pertinent, this should be accompanied by information on how the data may be 
obtained." 
I. INTRODUCTION


ANALYSIS of earnings management often focuses on management’s use of discretionary
accruals.¹ Such research requires a model that estimates the discretionary component(s)
of reported income. Existing models range from simple models in which discretionary
accruals are measured as total accruals, to more sophisticated models that attempt to separate total 
accruals into discretionary and nondiscretionary components. There is, however, no systematic 
evidence bearing on the relative performance of these alternative models at detecting earnings 
management. 

We evaluate the relative performance of the competing models by comparing the specifica- 
tion and power of commonly used test statistics. The specification of the test statistics is evaluated 
by examining the frequency with which they generate type I errors. Type I errors arise when the 
null hypothesis that earnings are not systematically managed in response to the stimulus identified 
by the researcher is rejected when the null is true. We generate type I errors for both a random 
sample of firm-years and for samples of firm-years with extreme financial performance. We focus 
on samples with extreme financial performance because the stimuli investigated in previous 
research are frequently correlated with financial performance. Thus, our findings shed light on 
the specification of test statistics in cases where the stimulus identified by the researcher does not 
cause earnings to be managed, but is correlated with firm performance. 

The power of the test statistics is evaluated by examining the frequency with which they 
generate type II errors. Type II errors arise when the null hypothesis that earnings are not
systematically managed in response to the stimulus identified by the researcher is not rejected 
when it is false. We generate type II errors in two ways. First, we measure rejection frequencies 
for samples of firm-years in which we have artificially added a fixed and known amount of 
accruals to each firm-year. These simulations are similar to those performed by Brown and 
Warner (1980, 1985) in evaluating alternative models for detecting abnormal stock price 
performance. However, our simulations differ in several respects. In particular, we must make 
explicit assumptions concerning the components of accruals that are managed and the timing of 
the accrual reversals. To the extent that our assumptions are not representative of the
circumstances of actual earnings management, our results lack external validity. To circumvent this
problem, we generate type II errors for a second set of firms, for which we have strong priors that 
earnings have been managed.² This sample consists of firms that have been targeted by the
Securities and Exchange Commission (SEC) for allegedly overstating annual earnings. The 
external validity of these results rests on the assumption that the SEC has correctly identified firm- 
years in which earnings have been managed. This assumption seems reasonable, since the SEC 
(1992) indicates that out of the large number of cases that are brought to its attention, it only 
pursues cases involving the most significant and blatant incidences of earnings manipulation. 
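
The first approach described above (adding a known amount of accruals to a firm-year and counting how often the null is rejected) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' procedure: the function name, the normally distributed accrual proxy, the standard deviation of 10 percent of assets, the ten-year estimation period, and the one-tailed 5 percent test are all hypothetical choices.

```python
import random
import statistics
from math import sqrt

# One-tailed 5% critical value of Student's t with 9 degrees of freedom.
# It is only valid for the default n_est = 10 used below.
T_CRIT_DF9 = 1.833


def rejection_frequency(added_accrual, sigma=0.10, n_est=10,
                        trials=2000, seed=1, t_crit=T_CRIT_DF9):
    """Share of simulated firms for which 'no earnings management' is rejected.

    added_accrual: amount (as a fraction of assets) added to the event year.
    sigma: assumed standard deviation of the accrual proxy (hypothetical).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Estimation-period accrual proxies with no earnings management.
        est = [rng.gauss(0.0, sigma) for _ in range(n_est)]
        # Event-year proxy with a known amount of accruals added.
        event = rng.gauss(0.0, sigma) + added_accrual
        s = statistics.stdev(est)
        # Prediction-error t-statistic for a single event year.
        t = (event - statistics.mean(est)) / (s * sqrt(1.0 + 1.0 / n_est))
        if t > t_crit:
            rejections += 1
    return rejections / trials


type_one_rate = rejection_frequency(0.0)    # no added accruals: type I errors
power_small = rejection_frequency(0.01)     # one percent of assets added
power_large = rejection_frequency(0.50)     # implausibly large manipulation
```

With no accruals added, the rejection frequency is the type I error rate and should sit near the nominal test level. One minus the rejection frequency for a positive added amount is the type II error rate under these assumed parameters.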

The empirical analysis generates the following major insights. First, all of the models appear 
well specified when applied to a random sample of firm-years. Second, the models all generate 
tests of low power for earnings management of economically plausible magnitudes (e.g., one to 
five percent of total assets). Third, all models reject the null hypothesis of no earnings 


¹ See, for example, Healy (1985), DeAngelo (1986) and Jones (1991). Other constructs that have been used to detect
earnings management include accounting procedure changes (Healy 1985; Healy and Palepu 1990; Sweeney 1994),
specific components of discretionary accruals (McNichols and Wilson 1988; DeAngelo et al. 1994) and components
of discretionary cash flows (Dechow and Sloan 1991).

² Schipper (1989) defines earnings management as purposeful intervention in the external financial reporting process,
with the intent of obtaining some private gain (as opposed to, say, merely facilitating the neutral operation of the
process). In the spirit of Schipper’s definition, our procedure assumes that reported earnings in the firm-years targeted
by the SEC are higher than they would have been under the neutral application of GAAP.
management at rates exceeding the specified test-levels when applied to samples of firms with
extreme financial performance.³ Finally, a version of the model developed by Jones (1991) that
is modified to detect revenue-based earnings management generates the fewest type II errors. 

The paper is organized as follows. Section II outlines the statistical testing procedure used 
to detect earnings management and highlights the effects of model misspecification on statistical 
inference. Section III introduces the competing models for measuring discretionary accruals. The 
experimental design is described in section IV and the empirical results are analyzed in section 
V. Section VI concludes the paper and provides suggestions for future earnings management 
research. 


II. STATISTICAL ISSUES 


This section considers potential misspecifications in tests for earnings management and their 
impact on inferences concerning earnings management. The analysis builds on a related analysis 
in McNichols and Wilson (1988). Following McNichols and Wilson, accrual-based tests for 
earnings management can be cast in the following linear framework: 


$$DA_t = \alpha + \beta\,PART_t + \sum_{k=1}^{K} \gamma_k X_{kt} + \varepsilon_t \qquad (1)$$

where
DA = discretionary accruals (typically deflated by lagged total assets);
PART = a dummy variable partitioning the data set into two groups for which earnings
management predictions are specified by the researcher;
$X_k$ = (for k = 1, ..., K) other relevant variables influencing discretionary accruals; and
$\varepsilon$ = an error term that is independently and identically normally distributed.


In most research contexts, PART will be set equal to one in firm-years during which systematic
earnings management is hypothesized in response to the stimulus identified by the researcher (the
"event period") and zero during firm-years in which no systematic earnings management is
hypothesized (the "estimation period"). The null hypothesis of no earnings management in
response to the researcher's stimulus will be rejected if $\hat{\beta}$, the estimated coefficient on PART,
has the hypothesized sign and is statistically significant at conventional levels.
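
As a minimal sketch of the test just described, the following regresses an accrual series on a constant and the PART dummy and returns the coefficient on PART with its conventional OLS t-statistic. The function name and the five data points are hypothetical and chosen only for illustration. With a single dummy regressor, the OLS coefficient equals the difference between the event-period and estimation-period means, which is what the code exploits.

```python
from math import sqrt


def part_regression(dap, part):
    """OLS of accruals on a constant and a 0/1 PART dummy.

    Returns (b_hat, standard_error, t_statistic) for the PART coefficient.
    """
    event = [y for y, p in zip(dap, part) if p == 1]
    est = [y for y, p in zip(dap, part) if p == 0]
    mean_event = sum(event) / len(event)
    mean_est = sum(est) / len(est)
    # With one dummy regressor the slope is the difference in group means.
    b_hat = mean_event - mean_est
    # Residual sum of squares around each group's own mean.
    ssr = (sum((y - mean_event) ** 2 for y in event)
           + sum((y - mean_est) ** 2 for y in est))
    dof = len(dap) - 2  # two estimated parameters: intercept and slope
    se = sqrt(ssr / dof * (1.0 / len(event) + 1.0 / len(est)))
    return b_hat, se, b_hat / se


# Hypothetical accruals (deflated by lagged assets): three estimation-period
# years followed by two event-period years.
dap = [0.01, 0.03, 0.02, 0.06, 0.08]
part = [0, 0, 0, 1, 1]
b_hat, se, t_stat = part_regression(dap, part)
```

The t-statistic would then be compared with the critical value of a t-distribution with n - 2 degrees of freedom, in the direction predicted by the researcher's hypothesis.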
Unfortunately, the researcher cannot readily identify the other relevant variables (the $X_k$s),
and so excludes them from the model. Similarly, the researcher does not observe DA, and is forced
to use a proxy (DAP) that measures DA with error ($\nu$):

$$DAP_t = DA_t + \nu_t .$$

Thus, the correctly specified model can be expressed in terms of the researcher's proxy for
discretionary accruals as

$$DAP_t = \alpha + \beta\,PART_t + \sum_{k=1}^{K} \gamma_k X_{kt} + \nu_t + \varepsilon_t \qquad (1')$$

This model can be summarized as:

$$DAP_t = \alpha + \beta\,PART_t + \mu_t + \varepsilon_t \qquad (1'')$$


³ The excessive rejection rates in the samples with extreme financial performance have two potential causes. First,
nondiscretionary accruals (that are not extracted by the models) may be correlated with firm performance. Second, other
factors that are correlated with firm performance may cause earnings to be systematically managed. In the first case,
the null hypothesis is falsely rejected because of correlated measurement error in the proxy for discretionary accruals.
In the second case, the tests are correctly detecting earnings management, but the cause of earnings management is not
known. Thus, if a researcher selects a stimulus that does not cause earnings to be managed but is correlated with firm
performance, the test will be misspecified. We expand on these issues in section II.
where $\mu$ captures the sum of the effects of the omitted relevant variables on discretionary accruals
and the error in the researcher's proxy for discretionary accruals. Given the regular Gaussian
assumptions,⁴ the OLS estimate of $\beta$, $\hat{\beta}$, from a multiple regression of DAP on PART and $\mu$ is
the best unbiased estimator of $\beta$. Also, the ratio of ($\hat{\beta} - \beta$) to its standard error, SE($\hat{\beta}$), has a
t-distribution, which can be used to test for earnings management. This framework therefore
provides a benchmark for evaluating the case where $\mu$ is omitted from the regression.
The model of earnings management typically estimated by the researcher can be represented
as

$$DAP_t = \hat{a} + \hat{b}\,PART_t + \hat{e}_t . \qquad (2)$$


The researcher’s model is misspecified by the omission of the relevant variable $\mu$. Recall that
$\mu$ can represent either measurement error in DAP or omitted relevant variables influencing
DA. Estimating model (2) using OLS has two undesirable consequences:⁵


(i) $\hat{b}$ is a biased estimator of $\beta$, with the direction of the bias being of the same sign as the
correlation between PART and $\mu$; and

(ii) SE($\hat{b}$) is a biased estimator of SE($\hat{\beta}$). In particular, if PART and $\mu$ are uncorrelated, SE($\hat{b}$)
will provide an upwardly biased estimate of SE($\hat{\beta}$).
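
A short worked derivation may make consequence (i) concrete. The subscripts 1 and 0 (event and estimation period) and the bars for group means are notation introduced here for illustration; the sketch assumes PART is a 0/1 dummy and that $\varepsilon$ has zero mean in both groups.

```latex
\begin{align*}
\hat{b} &= \overline{DAP}_{1} - \overline{DAP}_{0}
         = \beta + (\bar{\mu}_{1} - \bar{\mu}_{0})
           + (\bar{\varepsilon}_{1} - \bar{\varepsilon}_{0}), \\
E[\hat{b}] &= \beta + E[\mu_t \mid PART_t = 1] - E[\mu_t \mid PART_t = 0]
            = \beta + \frac{\operatorname{Cov}(PART_t, \mu_t)}{\operatorname{Var}(PART_t)} .
\end{align*}
```

The bias term therefore carries the sign of the covariance between PART and $\mu$. For consequence (ii), if $\mu$ is uncorrelated with PART and has variance $\sigma_\mu^2$, the residual variance of model (2) is roughly $\sigma_\varepsilon^2 + \sigma_\mu^2$ rather than $\sigma_\varepsilon^2$, so the estimated standard error is inflated by a factor of roughly $\sqrt{1 + \sigma_\mu^2 / \sigma_\varepsilon^2}$.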


These consequences lead to the following three problems for statistical inference in tests for 
earnings management: 


Problem 1: Incorrectly attributing earnings management to PART 


If the earnings management that is hypothesized to be caused by PART does not take 
place (i.e., the true coefficient on PART is zero) and $\mu$ is correlated with PART, then the
estimated coefficient on PART will be biased away from zero, increasing the probability
of a type I error. 

This problem will arise when (i) the proxy for discretionary accruals contains 
measurement error that is correlated with PART and/or (ii) other variables that cause 
earnings management are correlated with PART and are omitted from the analysis. In this 
latter case, earnings management is correctly detected by the model, but causality is 
incorrectly attributed to PART. 


Problem 2: Unintentionally extracting earnings management caused by PART 


If the earnings management that is hypothesized to be caused by PART does take 
place and the correlation between $\mu$ and PART is opposite in sign to the true coefficient
on PART, then the estimated coefficient on PART will be biased toward zero. This will 
increase the probability of a type II error. 

This problem will arise when the model used to generate the discretionary accrual
proxy unintentionally removes some or all of the discretionary accruals. Under such
conditions, the measurement error in the proxy for discretionary accruals (i.e., $\mu$) will be
negatively correlated with the discretionary accrual proxy, causing the coefficient on
PART to be biased toward zero.


4 The required assumptions are (i) $\varepsilon_t$ is distributed independent normal with zero mean and common variance, $\sigma_\varepsilon^2$; and
(ii) $PART_t$ and $\mu_t$ are distributed independently of $\varepsilon_\tau$ for all t and $\tau$. The assumption that the residuals are normally
distributed is not one of the original Gaussian assumptions. It is, however, required (i) for the OLS estimate to be the
best of all (linear and nonlinear) unbiased estimators; and (ii) to derive the distribution of the test-statistic. Throughout
the remainder of the paper, references to the Gaussian assumptions will therefore include the normality assumption.

5 The derivation of these properties is identical to the standard derivation for the properties of OLS estimators in the case
of the exclusion of a relevant regressor (e.g., Johnston 1984, 260-261).


Problem 3: Low power test 


If $\mu$ is not correlated with PART, then the estimated coefficient on PART will not be
biased. However, the exclusion of relevant (uncorrelated) variables leads to an inflated 
standard error for the estimated coefficient on PART. This will increase the probability of 
a type II error. 


We will refer back to each of these problems in our discussion of the models for detecting 
earnings management. 
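
The consequences of omitting $\mu$ can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: the sample size, the noise levels and the function name `simulate_part_coefficient` are hypothetical choices made only for illustration. The true coefficient on PART is set to zero, so any nonzero average slope reflects the bias described in problem 1, and the larger reported standard error when $\mu$ is left out corresponds to problem 3.

```python
import numpy as np

def simulate_part_coefficient(shift_in_mu, include_mu=False,
                              n_obs=20, n_sims=2000, seed=0):
    """Average OLS slope on PART and its average reported standard error.

    The true coefficient on PART is zero. `shift_in_mu` (hypothetical) is the
    amount by which the omitted variable mu moves in the event year, which
    makes mu correlated with PART when it is nonzero.
    """
    rng = np.random.default_rng(seed)
    part = np.zeros(n_obs)
    part[-1] = 1.0                       # one event year, the rest is the estimation period
    slopes, std_errors = [], []
    for _ in range(n_sims):
        mu = shift_in_mu * part + rng.normal(0.0, 0.05, n_obs)
        eps = rng.normal(0.0, 0.05, n_obs)
        dap = mu + eps                   # beta = 0: no earnings management caused by PART
        columns = [np.ones(n_obs), part] + ([mu] if include_mu else [])
        X = np.column_stack(columns)
        coef, *_ = np.linalg.lstsq(X, dap, rcond=None)
        resid = dap - X @ coef
        s2 = resid @ resid / (n_obs - X.shape[1])
        std_errors.append(np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1]))
        slopes.append(coef[1])
    return float(np.mean(slopes)), float(np.mean(std_errors))

# Problem 1: mu correlated with PART and omitted -> slope biased away from zero.
biased_slope, _ = simulate_part_coefficient(shift_in_mu=0.10)
# Problem 3: mu uncorrelated with PART; omitting it inflates the standard error.
_, se_omitted = simulate_part_coefficient(shift_in_mu=0.0)
_, se_included = simulate_part_coefficient(shift_in_mu=0.0, include_mu=True)
```

Under these assumptions the average slope in the first call sits near the assumed shift rather than near zero, and `se_omitted` exceeds `se_included`.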


III. MEASURING DISCRETIONARY ACCRUALS


The usual starting point for the measurement of discretionary accruals is total accruals. A 
particular model is then assumed for the process generating the nondiscretionary component of 
total accruals, enabling total accruals to be decomposed into a discretionary and a nondiscretionary 
component. Most of the models require at least one parameter to be estimated, and this is typically 
implemented through the use of an "estimation period," during which no systematic earnings 
management is predicted. This paper considers five models of the process generating 
nondiscretionary accruals. These models are general representations of those that have been used 
in the extant earnings management literature. We have cast all models in the same general 
framework to facilitate comparability, rather than trying to exactly replicate the models as they 
may have appeared in the literature. 


The Healy Model 


Healy (1985) tests for earnings management by comparing mean total accruals (scaled by 
lagged total assets) across the earnings management partitioning variable. Healy's study differs 
from most other earnings management studies in that he predicts that systematic earnings 
management occurs in every period. His partitioning variable divides the sample into three
groups, with earnings predicted to be managed upwards in one of the groups and downward in 
the other two groups. Inferences are then made through pairwise comparisons of the mean total 
accruals in the group where earnings is predicted to be managed upwards to the mean total 
accruals for each of the groups where earnings is predicted to be managed downwards. This 
approach is equivalent to treating the set of observations for which earnings are predicted to be 
managed upwards as the estimation period and the set of observations for which earnings are 
predicted to be managed downwards as the event period. The mean total accruals from the 
estimation period then represent the measure of nondiscretionary accruals. This implies the 
following model for nondiscretionary accruals: 


$$NDA_\tau = \frac{\sum_{t} TA_t}{T} \qquad (4)$$


where


NDA = estimated nondiscretionary accruals;

TA = total accruals scaled by lagged total assets;

t = 1, 2, ..., T is a year subscript for years included in the estimation period; and

$\tau$ = a year subscript indicating a year in the event period.

The DeAngelo Model


DeAngelo (1986) tests for earnings management by computing first differences in total 
accruals, and by assuming that the first differences have an expected value of zero under the null 
hypothesis of no earnings management. This model uses last period’s total accruals (scaled by 
lagged total assets) as the measure of nondiscretionary accruals. Thus, the DeAngelo Model for 
nondiscretionary accruals is:

$$NDA_\tau = TA_{\tau-1} \qquad (5)$$


The DeAngelo Model can be viewed as a special case of the Healy Model, in which the 
estimation period for nondiscretionary accruals is restricted to the previous year's observation. 

A common feature of the Healy and DeAngelo Models is that they both use total accruals from
the estimation period to proxy for expected nondiscretionary accruals. If nondiscretionary 
accruals are constant over time and discretionary accruals have a mean of zero in the estimation 
period, then both the Healy and DeAngelo Models will measure nondiscretionary accruals 
without error. If, however, nondiscretionary accruals change from period to period, then both 
models will tend to measure nondiscretionary accruals with error. Which of the two models is 
more appropriate then depends on the nature of the time-series process generating nondiscretionary 
accruals. If nondiscretionary accruals follow a white noise process around a constant mean, then 
the Healy model is appropriate. If nondiscretionary accruals follow a random walk, then the 
DeAngelo model is appropriate. Empirical evidence suggests that total accruals are stationary in 
the levels and approximate a white noise process (e.g., Dechow 1994). 
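
As a minimal sketch of the two benchmarks, the snippet below computes the Healy and DeAngelo estimates of nondiscretionary accruals for a single firm. The accrual figures are hypothetical and the function names are illustrative, not taken from the paper.

```python
from statistics import mean

def healy_nda(ta_estimation_period):
    """Healy Model: mean scaled total accruals over the estimation period."""
    return mean(ta_estimation_period)

def deangelo_nda(ta_prior_year):
    """DeAngelo Model: last period's scaled total accruals."""
    return ta_prior_year

# Hypothetical scaled total accruals, oldest year first.
ta_estimation = [0.02, -0.01, 0.03, 0.00]
ta_event = 0.06

# Discretionary accrual proxy = total accruals less estimated nondiscretionary accruals.
da_healy = ta_event - healy_nda(ta_estimation)
da_deangelo = ta_event - deangelo_nda(ta_estimation[-1])
```

With a long estimation period the Healy estimate averages out year-to-year noise, whereas the DeAngelo estimate inherits all of the noise in the single prior-year observation.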

The assumption that nondiscretionary accruals are constant is unlikely to be empirically 
descriptive. Kaplan (1985) points out that the nature of the accrual accounting process dictates 
that the level of nondiscretionary accruals should change in response to changes in economic 
circumstances. Failure to model the impact of economic circumstances on nondiscretionary
accruals will cause inflated standard errors due to the omission of relevant (uncorrelated) 
variables (problem 3). In addition, if the firms examined are systematically experiencing 
abnormal economic circumstances, then failure to model the impact of economic circumstances 
on nondiscretionary accruals will result in biased estimates of the coefficient on PART (problem 1). 


The Jones Model 


Jones (1991) proposes a model that relaxes the assumption that nondiscretionary accruals are 
constant. Her model attempts to control for the effect of changes in a firm's economic 
circumstances on nondiscretionary accruals. The Jones Model for nondiscretionary accruals in 
the event year is: 


$$NDA_\tau = \alpha_1(1/A_{\tau-1}) + \alpha_2(\Delta REV_\tau) + \alpha_3(PPE_\tau) \qquad (6)$$
where
$\Delta REV_\tau$ = revenues in year $\tau$ less revenues in year $\tau-1$ scaled by total assets at $\tau-1$;
$PPE_\tau$ = gross property plant and equipment in year $\tau$ scaled by total assets at $\tau-1$;
$A_{\tau-1}$ = total assets at $\tau-1$; and
$\alpha_1, \alpha_2, \alpha_3$ = firm-specific parameters.

Estimates of the firm-specific parameters, $\alpha_1$, $\alpha_2$, and $\alpha_3$, are generated using the following
model in the estimation period:


$$TA_t = a_1(1/A_{t-1}) + a_2(\Delta REV_t) + a_3(PPE_t) + \upsilon_t \qquad (7)$$


where
$a_1$, $a_2$, and $a_3$ denote the OLS estimates of $\alpha_1$, $\alpha_2$, and $\alpha_3$ and TA is total accruals scaled by lagged
total assets. The results in Jones (1991) indicate that the model is successful at explaining around 
one quarter of the variation in total accruals. 

An assumption implicit in the Jones model is that revenues are nondiscretionary. If earnings 
are managed through discretionary revenues, then the Jones Model will remove part of the 
managed earnings from the discretionary accrual proxy (problem 2). For example, consider a 
situation where management uses its discretion to accrue revenues at year-end when the cash has 
not yet been received and it is highly questionable whether the revenues have been earned. The 
result of this managerial discretion will be an increase in revenues and total accruals (through an 
increase in receivables). The Jones model orthogonalizes total accruals with respect to revenues 
and will therefore extract this discretionary component of accruals, causing the estimate of 
earnings management to be biased toward zero. Jones recognizes this limitation of her model (see
Jones 1991, footnote 31). 
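
The two-step logic of the Jones Model (estimate the firm-specific parameters over the estimation period, then predict nondiscretionary accruals in the event year) can be sketched as follows. The data are simulated and every number is a hypothetical placeholder; the function names are illustrative and the code is not a replication of Jones (1991).

```python
import numpy as np

def fit_jones(ta, inv_lag_assets, d_rev, ppe):
    """OLS estimates (a1, a2, a3) of equation (7); the regression has no separate intercept."""
    X = np.column_stack([inv_lag_assets, d_rev, ppe])
    coef, *_ = np.linalg.lstsq(X, np.asarray(ta, dtype=float), rcond=None)
    return coef

def jones_nda(coef, inv_lag_assets, d_rev, ppe):
    """Equation (6): predicted nondiscretionary accruals for an event year."""
    return coef[0] * inv_lag_assets + coef[1] * d_rev + coef[2] * ppe

# Hypothetical estimation-period data for one firm (12 years).
rng = np.random.default_rng(1)
inv_a = rng.uniform(0.001, 0.01, 12)     # 1 / lagged total assets
d_rev = rng.normal(0.05, 0.10, 12)       # scaled change in revenues
ppe = rng.uniform(0.3, 0.6, 12)          # scaled gross PPE
ta = 0.5 * inv_a + 0.1 * d_rev - 0.05 * ppe + rng.normal(0.0, 0.02, 12)

coef = fit_jones(ta, inv_a, d_rev, ppe)
nda_event = jones_nda(coef, inv_lag_assets=0.005, d_rev=0.20, ppe=0.45)
dap_event = 0.06 - nda_event             # event-year total accruals assumed to be 0.06
```

Because the change in revenues enters the prediction, any accrual that moves together with revenues in the event year is classified as nondiscretionary, which is the mechanism behind problem 2.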


The Modified Jones Model 


We consider a modified version of the Jones Model in the empirical analysis. The modification
is designed to eliminate the conjectured tendency of the Jones Model to measure discretionary
accruals with error when discretion is exercised over revenues. In the modified model,
nondiscretionary accruals are estimated during the event period (i.e., during periods in which
earnings management is hypothesized) as:


$$NDA_\tau = \alpha_1(1/A_{\tau-1}) + \alpha_2(\Delta REV_\tau - \Delta REC_\tau) + \alpha_3(PPE_\tau) \qquad (8)$$
where


$\Delta REC_\tau$ = net receivables in year $\tau$ less net receivables in year $\tau-1$ scaled by total assets at $\tau-1$.

The estimates of $\alpha_1$, $\alpha_2$, $\alpha_3$, and nondiscretionary accruals during the estimation period (in
which no systematic earnings management is hypothesized) are those obtained from the original 
Jones Model. The only adjustment relative to the original Jones Model is that the change in 
revenues is adjusted for the change in receivables in the event period. The original Jones Model 
implicitly assumes that discretion is not exercised over revenue in either the estimation period or 
the event period. The modified version of the Jones Model implicitly assumes that all changes in 
credit sales in the event period result from earnings management. This is based on the reasoning 
that it is easier to manage earnings by exercising discretion over the recognition of revenue on 
credit sales than it is to manage earnings by exercising discretion over the recognition of revenue
on cash sales. If this modification is successful, then the estimate of earnings management should 
no longer be biased toward zero in samples where earnings management has taken place through 
the management of revenues. 
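
A small numerical sketch may help show what the adjustment does. The coefficients and event-year figures below are hypothetical, chosen only so that the arithmetic is easy to follow; they are not estimates from the paper.

```python
def nda_jones(a, inv_lag_assets, d_rev, ppe):
    """Original Jones Model prediction, equation (6)."""
    return a[0] * inv_lag_assets + a[1] * d_rev + a[2] * ppe

def nda_modified_jones(a, inv_lag_assets, d_rev, d_rec, ppe):
    """Modified Jones Model prediction, equation (8): the revenue change is
    reduced by the change in receivables in the event period."""
    return a[0] * inv_lag_assets + a[1] * (d_rev - d_rec) + a[2] * ppe

# Hypothetical parameter estimates from the estimation period.
a = (0.5, 0.1, -0.05)
# Hypothetical event year: most of the revenue growth is credit sales.
inv_a, d_rev, d_rec, ppe, ta = 0.01, 0.20, 0.15, 0.40, 0.06

da_jones = ta - nda_jones(a, inv_a, d_rev, ppe)                    # 0.06 - 0.005
da_modified = ta - nda_modified_jones(a, inv_a, d_rev, d_rec, ppe)  # 0.06 - (-0.010)
```

With a positive coefficient on the revenue term, removing the receivables change lowers predicted nondiscretionary accruals, so a larger share of total accruals is left in the discretionary accrual proxy.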


The Industry Model


The final model considered is the Industry Model used by Dechow and Sloan (1991). Similar 
to the Jones Model, the Industry Model relaxes the assumption that nondiscretionary accruals are
constant over time. However, instead of attempting to directly model the determinants of
nondiscretionary accruals, the Industry Model assumes that variation in the determinants of
nondiscretionary accruals is common across firms in the same industry. The Industry Model for
nondiscretionary accruals is: 


$$NDA_t = \gamma_1 + \gamma_2\,\mathrm{median}(TA_t) \qquad (9)$$
where
median($TA_t$) = the median value of total accruals scaled by lagged assets for all non-sample
firms in the same 2-digit SIC code.6


The firm-specific parameters $\gamma_1$ and $\gamma_2$ are estimated using OLS on the observations in the
estimation period. 

The ability of the Industry Model to mitigate measurement error in discretionary accruals 
hinges critically on two factors. First, the Industry Model only removes variation in nondiscretionary 
accruals that is common across firms in the same industry. If changes in nondiscretionary accruals 
largely reflect responses to changes in firm-specific circumstances, then the Industry Model will 
not extract all nondiscretionary accruals from the discretionary accrual proxy. Second, the 
Industry Model removes variation in discretionary accruals that is correlated across firms in the 
same industry, potentially causing problem 2. The severity of this problem depends on the extent 
to which the earnings management stimulus is correlated across firms in the same industry. 


IV. EXPERIMENTAL DESIGN 
Sample Construction 


The empirical analysis is conducted by testing for earnings management using four distinct 
samples of firm-years as event-years: 


(i) a randomly selected sample of 1000 firm-years;

(ii) samples of 1000 firm-years that are randomly selected from pools of firm-years experiencing
extreme financial performance;

(iii) samples of 1000 randomly selected firm-years in which a fixed and known amount of 
accrual manipulation has been artificially introduced; and 

(iv) a sample of 32 firms that are subject to SEC enforcement actions for allegedly overstating 
annual earnings in 56 firm-years. 


Sample (i) is designed to investigate the specification of the test statistics generated by the 
models when the measurement error in discretionary accruals ($\mu$) is uncorrelated with the
earnings management partitioning variable (PART). Because the earnings management partitioning
variable is selected at random, it is expected to be uncorrelated with any omitted variables.
Note that this is simply a test of whether the Gaussian assumptions underlying the regression are 
satisfied. The existence of uncorrelated omitted variables reduces the power of the test (problem 
3), but will not systematically bias the type I errors. 

The 1000 randomly selected firm-years are selected from the 168,771 firm-years on the 
COMPUSTAT industrial files with the necessary data between 1950 and 1991. The 1000 firm- 
years are selected in a sequential fashion and without replacement. A firm-year is not selected if 
its inclusion in the random sample leaves fewer than ten unselected observations for the estimation
period. Selected firms have an average of 21.5 observations. The requirement of more than 10 
observations is necessary to efficiently estimate the parameters of the nondiscretionary accrual 
models for each firm. This sequential selection procedure continues until the random sample 
consists of 1000 firm-years. 

Sample (ii) is designed to test the specification of each model when the earnings management
partitioning variable, PART, is correlated with firm performance. The earnings management
stimuli investigated in many existing studies are correlated with firm performance. For
example, Healy (1985) hypothesizes that management reduce earnings when either earnings are
below the lower bound or cash from operations are above the upper bound of top executive bonus
plans. Researchers have also investigated whether management attempt to loosen debt covenant
restrictions through their accrual choices (e.g., Defond and Jiambalvo 1994; DeAngelo et al.
1994). Firms close to debt covenant restrictions are often experiencing poor earnings and/or cash
flow performance. A final example is studies investigating accrual manipulation around
non-routine management changes (e.g., Pourciau 1993; DeAngelo 1988). DeAngelo (1988) points out
that poor prior earnings performance is often cited as a reason for management change. Thus,
sample (ii) is used to examine the impact of firm performance on model misspecification.


6 The use of two-digit SIC levels represents a trade-off between defining industry groupings narrowly enough that the
Industry Model captures the industry specific effects versus having enough firms in each industry grouping so that the
model can effectively diversify firm-specific effects.

To investigate the estimates of discretionary accruals produced by the models when firm 
performance is unusual, firm-years are selected to have either extreme earnings performance 
or extreme cash from operations performance.7 A "high" and a "low" sample is formed for each
of the performance measures, resulting in a total of four samples. These samples are formed using 
the following procedure. Each of the performance measures is standardized by lagged total assets. 
All firm-years with available data on the COMPUSTAT industrial files are then separately ranked 
on each performance measure. For each measure, firm-years are assigned in equal numbers to 
decile portfolios based on their ordered ranks. Each portfolio contains approximately 17,000 
firm-years. Samples of 1000 firm-years are randomly selected from the highest and lowest 
portfolios for each performance measure using the same procedure that was discussed for sample (i).

Sample (iii) is designed to evaluate the relative frequency with which the competing models 
of nondiscretionary accruals generate type II errors. Brown and Warner (1980, 1985) investigate 
the type II errors of alternative models for measuring security price performance by artificially 
introducing a fixed and known amount of abnormal stock price performance into a randomly 
selected sample of firm-years. Inducing abnormal accruals is more complex than inducing 
abnormal stock returns for two reasons. First, we have to make explicit assumptions concerning 
the component(s) of accruals that are managed. This assumption is critical for the Jones Model, 
because if we introduce earnings management by artificially inflating revenues, then both 
accruals and revenue increase. The increase in revenue will affect the estimate of nondiscretionary
accruals generated by the Jones Model. Second, since accruals must sum to zero over the life of
the firm, artificially inducing discretionary accruals requires additional assumptions about the
timing of the accrual reversals. Thus, we artificially introduce earnings management into sample
(iii), but recognize that the external validity of the results is contingent upon how representative
our assumptions are of actual cases of earnings management. 

We obtain sample (iii) by beginning with the 1000 randomly selected firm-years in sample 
(i) and then adding accrual manipulation ranging in magnitude from zero percent to 100 percent 
of lagged assets (in increments of ten percent). In all cases, we assume that the accruals fully 
reverse themselves in the next fiscal year. We make three different sets of assumptions regarding 
the components of accruals that are managed: 


(1) Expense Manipulation - delayed recognition of expenses. This approach is implemented by
adding the assumed amount of expense manipulation to total accruals in the earnings 
management year, and subtracting the same amount in the following year. Since none of the 
models use expenses to estimate nondiscretionary accruals, none of the other variables used 
in the study need to be adjusted. 


7 We focus on the most extreme deciles of each performance measure to generate powerful tests for possible performance 
related biases. Our samples are therefore likely to have more extreme performance than that occurring in specific 
earnings management studies. Thus, we expect the performance related misspecification to be more severe in our 
extreme decile samples. In additional tests (not reported) we confirm that the performance induced misspecifications 
are not limited to the extreme deciles. 
(2) Revenue Manipulation - premature recognition of revenue (assuming all costs are fixed).
This approach is implemented by adding the assumed amount of revenue manipulation to 
total accruals, revenue and accounts receivable. The same amount is subtracted from total 
accruals, revenue and accounts receivable in the following year; and 

(3) Margin Manipulation - premature recognition of revenue (assuming all costs are variable). 
This approach is implemented by adding the assumed amount of margin manipulation to 
total accruals and by adding the following to revenue and accounts receivable: 


(assumed amount of margin manipulation) / (net income ratio), 


where the net income ratio is the ratio of the firm's net income to revenue, estimated using 
the median value of the ratio from observations in each firm's estimation period. For 
example, to artificially introduce earnings management of one percent of lagged assets in a 
firm with a net income ratio of ten percent, we add one percent of lagged assets to total 
accruals and ten percent of lagged assets to revenue and accounts receivable. The same 
amounts are subtracted from total accruals, revenue and accounts receivable in the following 
year. 


The difference between assumptions (2) and (3) relates to the matching of expenses to
manipulated revenues. Assumption (2) corresponds to 'pure' revenue manipulation, in which 
revenues are manipulated upwards, but expenses do not change. Assumption (3) corresponds to 
premature recognition of a sale in a setting where all costs are variable. Revenues are manipulated 
upwards, but expenses are matched to the manipulated revenues. The crucial difference between 
(2) and (3) is that (3) requires much greater revenue manipulation in order to achieve a given 
increase in earnings. Assumptions (2) and (3) are extremes on a continuum, and in practice, we 
would expect most revenue-based earnings management to lie between these two extremes. 
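
The three sets of assumptions can be summarized as adjustments to total accruals, revenue and accounts receivable in the earnings management year, with the opposite adjustments in the following year. The sketch below is one possible encoding; the function names and dictionary keys are hypothetical, and all amounts are expressed as fractions of lagged assets.

```python
def manipulation_adjustments(kind, amount, net_income_ratio=None):
    """Event-year additions implied by assumptions (1), (2) and (3)."""
    if kind == "expense":
        # (1) Only total accruals change; no other model input is affected.
        return {"total_accruals": amount, "revenue": 0.0, "receivables": 0.0}
    if kind == "revenue":
        # (2) All costs fixed: revenue and receivables rise by the same amount.
        return {"total_accruals": amount, "revenue": amount, "receivables": amount}
    if kind == "margin":
        # (3) All costs variable: revenue must rise by amount / net income ratio.
        if not net_income_ratio:
            raise ValueError("margin manipulation needs a nonzero net income ratio")
        gross = amount / net_income_ratio
        return {"total_accruals": amount, "revenue": gross, "receivables": gross}
    raise ValueError(f"unknown manipulation type: {kind}")

def reversal(adjustments):
    """Following-year adjustments: the accruals are assumed to reverse fully."""
    return {name: -value for name, value in adjustments.items()}

# The example from the text: one percent of lagged assets, net income ratio of ten percent.
event_year = manipulation_adjustments("margin", 0.01, net_income_ratio=0.10)
next_year = reversal(event_year)
```

In this example total accruals rise by one percent of lagged assets while revenue and receivables rise by about ten percent, matching the worked example in assumption (3).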

Interpretation of the type II errors for sample (iii) is contingent on the explicit assumptions 
that are made concerning how earnings are managed. In order to reinforce the external validity 
of our conclusions concerning type II errors, we examine a sample of firm-years for which we 
have strong a priori reasons to expect earnings management of a known sign. Sample (iv) consists 
of firm-years that are subject to accounting-based enforcement actions by the SEC. The SEC takes 
enforcement actions against firms and individuals having allegedly violated the financial 
reporting requirements of the securities laws. Since April 1982, the SEC has published the details 
of its major enforcement actions in a series of Accounting and Auditing Enforcement Releases 
(AAERs).8 

Enforcement actions in which the Commission alleges that a firm has overstated annual 
earnings in violation of Generally Accepted Accounting Principles (GAAP) are brought pursuant 
to Section 13(a).9 A total of 134 firms are the subject of AAERs brought pursuant to Section 13(a).
We further require that (i) each firm has at least ten years of the required financial statement data 
on the COMPUSTAT industrial files (excluding the years in which the alleged overstatements of 
earnings occurred); (ii) the AAER alleges that annual earnings have been overstated (many of the 
AAERs relate to overstatements of quarterly earnings that are reversed before the fiscal year end); 
and (iii) the AAER does not relate to a financial institution (since the current asset and current 
liability variables that we use to compute accruals are not available for these firms). These
restrictions result in a sample of 32 firms that are alleged to have overstated earnings in a total of
56 firm-years. Fifteen of the sample firms are targeted for overstating revenue alone, three are
targeted for overstating revenue in combination with understating expenses and the remaining 14
firms are alleged to have understated a variety of expenses.


8 Feroz et al. (1991) provide descriptive evidence on the AAERs and their financial and market effects. Pincus et al. (1988)
describe the events leading to a formal SEC investigation and the publication of an AAER.

9 Section 13(a) requires issuers whose securities are registered with the Commission to file reports (including the annual
financial statements on form 10-K) as specified by Commission rules and regulations. The financial statements
contained in these filings are required to comply with Regulation S-X, which in turn requires conformity with GAAP.


Data Analysis


The empirical tests for earnings management follow from the regression framework 
developed in section II. The empirical tests are separately applied to each of the samples described 
above. The firm-years in each sample represent the event-years that are to be tested for earnings 
management. We therefore begin by matching each firm-year represented in a sample with the 
remaining non-event-years for that firm on COMPUSTAT to form the estimation period. The 
sample selection procedures ensure that all firms have at least ten observations in their estimation 
period. 

Consistent with previous studies of earnings management (Healy 1985 and Jones 1991), total 
accruals (TA) are computed as:10


$$TA_t = (\Delta CA_t - \Delta CL_t - \Delta Cash_t + \Delta STD_t - Dep_t)/(A_{t-1})$$
where
$\Delta CA$ = change in current assets (COMPUSTAT item 4);
$\Delta CL$ = change in current liabilities (COMPUSTAT item 5);
$\Delta Cash$ = change in cash and cash equivalents (COMPUSTAT item 1);
$\Delta STD$ = change in debt included in current liabilities (COMPUSTAT item 34);
Dep = depreciation and amortization expense (COMPUSTAT item 14); and
A = Total Assets (COMPUSTAT item 6).


Earnings is measured using net income before extraordinary items and discontinued 
operations (COMPUSTAT item 18) and is also standardized by lagged total assets. Cash from 
operations is computed as: 

Cash from operations = Earnings — TA. 
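
As a hedged illustration of these definitions, the sketch below computes scaled total accruals and cash from operations for one firm-year. The dollar amounts are hypothetical and the function names are illustrative.

```python
def total_accruals(d_ca, d_cl, d_cash, d_std, dep, lagged_assets):
    """Total accruals scaled by lagged total assets, following the definition above."""
    return (d_ca - d_cl - d_cash + d_std - dep) / lagged_assets

def cash_from_operations(scaled_earnings, scaled_total_accruals):
    """Cash from operations = Earnings - TA, both scaled by lagged total assets."""
    return scaled_earnings - scaled_total_accruals

# Hypothetical firm-year (amounts in millions).
ta = total_accruals(d_ca=50.0, d_cl=20.0, d_cash=10.0, d_std=5.0,
                    dep=15.0, lagged_assets=1000.0)
cfo = cash_from_operations(scaled_earnings=0.08, scaled_total_accruals=ta)
```

Here the numerator is 50 - 20 - 10 + 5 - 15 = 10, so scaled total accruals are 0.01 and cash from operations is 0.07 of lagged assets.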


Using each of the competing models, discretionary accruals are then estimated by subtracting 
the predicted level of nondiscretionary accruals (NDAP) from total accruals (standardized by 
lagged total assets): 


$$DAP_{jt} = TA_{jt} - NDAP_{jt} \qquad (10)$$


To test for earnings management, the estimated discretionary accruals are regressed on the 
partitioning variable, PART. Recall that the regression pools across observations in the event 
period and the estimation period. PART is set equal to one if the observation is from the event 
period and zero if the observation is from the estimation period: 


$$DAP_{jt} = \hat{a}_j + \hat{b}_j\,PART_{jt} + \hat{e}_{jt} \qquad (11)$$


The coefficient on PART, $\hat{b}_j$, provides a point estimate of the magnitude of the earnings
management attributable to the stimulus represented by PART. The null hypothesis of no earnings
management in response to this factor is tested by applying a t-test to the null hypothesis
that $\hat{b}_j = 0$.11 The null hypothesis that the average t-statistic is zero for the N firms in the sample
is also tested by aggregating the individual t-statistics to form a Z-statistic:


$$Z = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} \frac{t_j}{\sqrt{k_j/(k_j-2)}}$$
where
$t_j$ = t-statistic for firm j; and
$k_j$ = degrees of freedom for t-statistic of firm j.


The Z-statistic is asymptotically distributed unit normal if the $t_j$s are cross-sectionally independent.


10 All data required to estimate the nondiscretionary accruals models and conduct the empirical analysis are initially
obtained from the COMPUSTAT industrial files. Data for the 56 event-years in the SEC sample are manually checked
to hard copies of the sample firms' annual reports. In some of the cases where the SEC requires a firm to restate its
earnings, we found that the COMPUSTAT files contained the restated numbers. In these cases, we substitute the original
figures reported in the hard copies of the annual reports.
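
The aggregation of firm-level t-statistics into a Z-statistic can be sketched as below. Each t-statistic is divided by the standard deviation of a t-distribution with $k_j$ degrees of freedom, $\sqrt{k_j/(k_j-2)}$, before summing. The function name and the input values are hypothetical.

```python
import math

def z_statistic(t_stats, dofs):
    """Sum of standardized t-statistics divided by sqrt(N).

    Each t_j is scaled by sqrt(k_j / (k_j - 2)), which requires k_j > 2.
    """
    if len(t_stats) != len(dofs) or not t_stats:
        raise ValueError("need one degrees-of-freedom value per t-statistic")
    if any(k <= 2 for k in dofs):
        raise ValueError("degrees of freedom must exceed 2")
    total = sum(t / math.sqrt(k / (k - 2)) for t, k in zip(t_stats, dofs))
    return total / math.sqrt(len(t_stats))

# Hypothetical sample of four firms, each with 10 degrees of freedom.
z = z_statistic([1.0, 1.0, 1.0, 1.0], [10, 10, 10, 10])
```

Four modest t-statistics of 1.0 combine into a Z-statistic of roughly 1.79, which shows how pooling across firms adds power as long as the firm-level statistics are independent.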


V. EMPIRICAL RESULTS 
Random Sample of Firm-Years 


Table 1 provides descriptive statistics on the parameter estimates and test statistics generated 
by each of the discretionary accrual models when applied to the sample of 1000 randomly selected 
firm-years. For each model, the row labeled “earnings management” represents the estimated 
coefficient on PART, ($\hat{b}_j$), the row labeled "standard error" represents the standard error of this
coefficient estimate, and the row labeled "t-statistic" represents the t-statistic for testing the null
hypothesis that this coefficient is equal to zero. The mean and median values of earnings
management are close to zero for all models indicating, as expected, that there is no systematic
evidence of earnings management in the randomly selected event-years relative to years in the
estimation period. The standard errors tend to be highest for the DeAngelo Model and lowest for 
the Jones and Modified Jones models, suggesting that the latter models are more effective at 
modeling the time-series process generating nondiscretionary accruals and suffer less from 
misspecifications caused by omitted determinants of nondiscretionary accruals. Note, however, 
that from a researcher’s perspective, the standard errors are high in all models. For example, the 
mean standard error exceeds 0.09 for all models. Earnings management would therefore have to 
exceed 18 percent of lagged assets before we would expect to generate a t-statistic greater than 
two for an individual firm. Alternatively, if a Z-statistic were computed for a sample of firms that 
had all managed earnings by one percent of total assets, over 300 firms would be required in the 
sample before the Z-statistic is expected to exceed two. Thus, none of the models are expected 
to produce powerful tests for earnings management of economically plausible magnitudes. 
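
The back-of-the-envelope power figures in this paragraph can be reproduced with the short sketch below. It assumes that every firm has the same standard error and ignores the degrees-of-freedom adjustment in the Z-statistic, so it is an approximation rather than the authors' exact calculation; the function names are hypothetical.

```python
def min_detectable_effect(std_error, t_target=2.0):
    """Smallest coefficient on PART expected to yield a t-statistic of t_target."""
    return t_target * std_error

def firms_needed(effect, std_error, z_target=2.0):
    """Approximate N such that sqrt(N) * (effect / std_error) equals z_target."""
    return (z_target * std_error / effect) ** 2

single_firm_threshold = min_detectable_effect(0.09)    # about 0.18 of lagged assets
pooled_sample_size = firms_needed(effect=0.01, std_error=0.09)  # about 324 firms
```

A standard error of 0.09 therefore implies a threshold of about 18 percent of lagged assets for a single firm, and about 324 firms when each manages earnings by one percent, consistent with the "over 300 firms" stated above.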

Table 2 reports the incidence of type I errors for the sample of 1000 randomly selected firm- 
years using the conventional test levels of five percent and one percent. Since the earnings 
management partitioning variable is selected at random in this sample, it is expected to be 
uncorrelated with any omitted variables. Thus, the type I errors should correspond to the test levels 
applied, so long as the Gaussian assumptions are satisfied. Type I errors are reported for both the 
null hypothesis that discretionary accruals are less than or equal to zero and the null hypothesis 


I The computation of the standard error of b, requires special attention because the measures of discretionary accruals 
in the event period (estimation period) are prediction errors (fitted residuals) from a first-pass estimation process. An 
adjustment must therefore be made to reflect the fact that the standard errors of the prediction errors are greater than 
the standard errors of the fitted residuals. Likewise, the degrees of freedom in the t-test must reflect the degrees of 
freedom used up in the first-pass estimation. This can be accomplished by either explicitly adjusting the standard error 
and degrees of freedom of the prediction errors (see Jones 1991) or by estimating a single stage regression that includes 
both PART and the determinants of nondiscretionary accruals (see Dechow and Sloan 1991). The two approaches are 
econometrically equivalent and we therefore use the latter approach for its computational ease (see Salkever 1976 for 
an extended discussion and proof on this issue). 

TABLE 1
Results of Tests for Earnings Management Using Alternative Models to Measure
Discretionary Accruals. The Results are Based on a Sample of 1000 Randomly
Selected Firm-Years.

                                     Standard    Lower                 Upper
Model                        Mean    Deviation   Quartile    Median    Quartile
Healy Model:
  earnings management       0.002      1.241      -0.035     -0.001     0.040
  standard error            0.195      4.573       0.039      0.065     0.104
  t-statistic               0.012      1.174      -0.583      0.010     0.598
DeAngelo Model:
  earnings management       0.002      0.151      -0.048      0.001     0.052
  standard error            0.281      6.799       0.054      0.090     0.143
  t-statistic               0.002      1.135      -0.577      0.018     0.637
Jones Model:
  earnings management       0.001      0.118      -0.037     -0.001     0.036
  standard error            0.092      0.438       0.036      0.060     0.095
  t-statistic               0.013      1.155      -0.647     -0.022     0.644
Modified Jones Model:
  earnings management       0.002      0.119      -0.035      0.001     0.041
  standard error            0.092      0.437       0.036      0.060     0.095
  t-statistic               0.062      1.204      -0.613      0.027     0.745
Industry Model:
  earnings management       0.002      0.662      -0.032      0.000     0.039
  standard error            0.211      5.363       0.038      0.063     0.101
  t-statistic               0.028      1.165      -0.555      0.006     0.637

Notes:
Earnings management represents the estimated coefficient on PART, $b_i$, from firm-specific regressions of
$DAP_{it} = a_i + b_i PART_{it} + e_{it}$; where DAP is the measure of discretionary accruals produced by each of the models and PART is
an indicator variable equal to 1 in a year in which earnings management is hypothesized to occur in response to the stimulus
identified by the researcher and 0 otherwise. Standard error is the standard error of the coefficient on PART for each of
the regressions and t-statistic is the t-statistic testing the null hypothesis that the coefficient on PART is equal to zero.


that discretionary accruals are greater than or equal to zero. A binomial test is also conducted to 
assess whether the empirical rejection frequencies are significantly different from the specified 
test levels. The empirical rejection frequencies are close to the specified test levels for all models, 
and none of the differences are significant at conventional levels. Thus, all models appear well 
specified for a random sample of firm-years. 
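
The binomial comparison can be sketched as follows. The text does not say whether an exact or an approximate binomial test is used, nor how the two tails are combined; the sketch assumes an exact test in which the smaller tail probability is doubled. The function names are hypothetical, and the 10.0% rejection frequency is an illustrative figure rather than a result from the study.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability mass, computed in logs to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def two_tailed_binomial_p(rejections, n, test_level):
    """Two-tailed p-value for H0: the true rejection frequency equals test_level.

    Assumption: the two-tailed p-value is twice the smaller tail, capped at 1.
    """
    lower = sum(binom_pmf(k, n, test_level) for k in range(0, rejections + 1))
    upper = sum(binom_pmf(k, n, test_level) for k in range(rejections, n + 1))
    return min(1.0, 2 * min(lower, upper))

# Rejection frequencies at the 5% test level out of 1000 firm-years.
p_jones = two_tailed_binomial_p(59, 1000, 0.05)      # 5.9%, Jones Model in table 2
p_industry = two_tailed_binomial_p(42, 1000, 0.05)   # 4.2%, Industry Model in table 2
p_extreme = two_tailed_binomial_p(100, 1000, 0.05)   # 10.0%, hypothetical
print(p_jones, p_industry, p_extreme)
```

Under this assumed test, frequencies of 5.9% and 4.2% are not distinguishable from the 5% test level, whereas a frequency of 10.0% would be.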


Samples of Firm-Years Experiencing Extreme Financial Performance 


This section considers the four samples of firm-years experiencing extreme financial 
performance. The first two samples exhibit high and low earnings performance, respectively. 
Figure 1 contains plots in event time of earnings and its components for each of the two samples. 

TABLE 2
Comparison of the Type I Errors for Tests of Earnings Management Based on
Alternative Models to Measure Discretionary Accruals. Percentage of 1000 Randomly
Selected Firm-Years for which the Null Hypothesis of No Earnings Management is
Rejected (One-Tailed Tests).

Null Hypothesis:            Earnings management ≤ 0    Earnings management ≥ 0
Test Level:                     5%         1%              5%         1%
Healy Model:
  t-test                       5.0%       1.3%            5.1%       1.4%
DeAngelo Model:
  t-test                       4.8        1.0             5.2        1.1
Jones Model:
  t-test                       4.9        1.4             5.9        1.5
Modified Jones Model:
  t-test                       4.9        1.3             5.9        1.3
Industry Model:
  t-test                       4.2        1.4             5.5        1.2

* Significantly different from the specified test level at the 5 percent level using a two-tailed binomial test.
** Significantly different from the specified test level at the 1 percent level using a two-tailed binomial test.


Year 0 represents the year in which the firm-years are selected based on their extreme earnings 
performance. There are separate plots for total accruals, cash from operations and earnings. Each
of the variables is scaled by lagged total assets and the median values for each of the two samples 
are shown in the plots. The bottom plot is of earnings performance. As expected, the high earnings 
performance sample gradually increases to a peak in year 0 and then declines thereafter. Similarly, 
the low earnings performance sample gradually declines to a trough in year 0 and then increases 
thereafter. The total accruals and cash from operations plots mirror the earnings plots, though the 
peaks and troughs are less extreme. This reflects the fact that earnings is the sum of cash from 
operations and accruals. Firms with high earnings tend to have high cash flows and high accruals. 
Similarly, firms with low earnings tend to have low cash flows and low accruals. 

Table 3 reports the rejection frequencies for tests of earnings management in response to the 
stimulus represented by PART. Since PART is measured by randomly selecting firms with 
extreme earnings performance, PART is constructed so that it is not itself a causal determinant 
of earnings management (although it may be imperfectly correlated with causal determinants). 
Thus, we have constructed a scenario which is analogous to the case where a researcher has 
selected a stimulus that is correlated with firm performance, but where the stimulus is not itself 
a causal determinant of earnings management. As such, any rejections of the null hypothesis of 
no earnings management represent type I errors. However, these results do not permit a direct 
assessment of the extent of misspecification in existing studies. Such an assessment requires a 
detailed reexamination of the stimulus in question [e.g., Holthausen et al. 1995]. 

FIGURE 1
Time Series of Median Annual Total Accruals, Cash from Operations and Earnings all
Standardized by Lagged Total Assets. Year 0 is the Year in which Firm-Years are
Selected from the Lowest and Highest Decile of Earnings Performance. Sample Consists
of 1000 Firm-Years Randomly Selected from Firm-Years in the Lowest and Highest
Decile of Earnings Performance.

[Plots omitted. Panels: median total accruals; median cash from operations; median earnings.]

TABLE 3
Comparison of the Type I Errors for Tests of Earnings Management Based on
Alternative Models to Measure Discretionary Accruals. Percentage of 1000
Firm-Years Randomly Selected from Firm-Years in the Lowest Decile and Highest
Decile of Earnings Performance for which the Null Hypothesis of No Earnings
Management is Rejected (One-Tailed Tests).

Null Hypothesis:            Earnings management ≤ 0    Earnings management ≥ 0
Test Level:                     5%         1%              5%         1%

Panel A: Lowest decile of earnings performance

Healy Model:
  t-test                       1.7%**     0.4%*          25.9%**     9.9%**
DeAngelo Model:
  t-test                       2.8**      0.4*           13.5**      [illegible]
Jones Model:
  t-test                       [illegible] 0.7           16.6**      5.4**
Modified Jones Model:
  t-test                       2.7**      0.6            17.6**      6.5**
Industry Model:
  t-test                       1.7**      0.4*           22.6**      8.4**

Panel B: Highest decile of earnings performance

Healy Model:
  t-test                      12.8%**     4.2%**          4.4%       1.4%
DeAngelo Model:
  t-test                       9.5**      1.6             4.2        0.9
Jones Model:
  t-test                       6.5*       1.3             6.3        1.4
Modified Jones Model:
  t-test                       7.6**      1.8*            5.5        1.4
Industry Model:
  t-test                      10.3**      2.0**           4.4        1.5

* Significantly different from the specified test level at the 5 percent level using a two-tailed binomial test.
** Significantly different from the specified test level at the 1 percent level using a two-tailed binomial test.


Panel A of table 3 reports the results for the low earnings performance sample. The proportions
of type I errors for tests of the null hypothesis that earnings management ≤ 0 are all less than the
specified test levels and many of the differences are statistically significant. Conversely, the
proportions of type I errors for tests of the null hypothesis that earnings management ≥ 0 are
appreciably greater than the corresponding test levels and the differences are statistically 
significant in all cases. For example, using a test level of five percent results in rejection rates 
ranging from 13.5% for the DeAngelo Model to 25.9% for the Healy Model. The high rejection 
rates arise because firm-years with low earnings also tend to have low total accruals and all the 
models attribute part of the lower accruals to negative discretionary accruals. Thus, the null 
hypothesis that earnings are not managed in response to the stimulus represented by PART is 
rejected in favor of the alternative that earnings are managed downwards. 

Panel B of table 3 reports rejection frequencies for the sample of firm-years selected on the
basis of high earnings performance. In this case, the results are opposite to those for the low
earnings performance sample. The null hypothesis that earnings management ≥ 0 is rejected at
rates similar to those reported for the random sample in table 2. However, the null hypothesis that
earnings management ≤ 0 is rejected at rates that are appreciably greater than the specified test
levels and the differences are statistically significant in nearly all cases. For example, the test level
of five percent yields rejection rates ranging from 6.5% for the Jones Model to 12.8% for the Healy
Model. This reflects the fact that firm-years with high earnings tend to have high accruals and the 
models of nondiscretionary accruals do not completely extract the higher accruals. In both panels 
A and B, the misspecifications are less severe for the Jones and Modified Jones models than for 
the Healy Model. This is consistent with part of the systematic behavior in accruals being 
extracted by these more sophisticated models. 

The results reported in panels A and B of table 3 are open to two interpretations (see the
discussion of problem 1 in section II): (i) earnings performance is correlated with the error in
measuring discretionary accruals (i.e., earnings performance is correlated with nondiscretionary
accruals that are not completely extracted by any of the models); and/or (ii) earnings performance
is correlated with other variables that cause earnings to be managed. If a researcher selects a 
stimulus that does not cause earnings to be managed but is correlated with earnings performance, 
then the tests for earnings management will generate excessive type I errors. That is, using the 
models evaluated here, the researcher will detect low discretionary accruals when earnings are 
low and high discretionary accruals when earnings are high, even if the cause of the earnings 
management is not the stimulus investigated by the researcher. 
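
The mechanism behind these excessive type I errors can be illustrated with a stylized simulation. It is a sketch under strong assumptions, not a reconstruction of our sampling procedure: accruals and cash flows are drawn as independent standard normal variables, there is no earnings management of any kind, and discretionary accruals are measured Healy-style as the event-year accrual less the mean of the remaining years. All names and parameter values are hypothetical.

```python
import math
import random

# One-tailed 5% critical value of Student's t with 8 degrees of freedom.
T_CRIT_5PCT_8DOF = 1.860

def healy_type_t(accruals, event_idx):
    """t-statistic for the event-year accrual against the mean of the other years."""
    est = [a for i, a in enumerate(accruals) if i != event_idx]
    n = len(est)
    mean = sum(est) / n
    s2 = sum((a - mean) ** 2 for a in est) / (n - 1)
    se = math.sqrt(s2 * (1 + 1 / n))     # standard error of a prediction error
    return (accruals[event_idx] - mean) / se

def rejection_rates(n_firms=2000, n_years=10, seed=0):
    """Share of firms rejecting 'earnings management >= 0' at the 5% level."""
    rng = random.Random(seed)
    reject_random = reject_low = 0
    for _ in range(n_firms):
        accruals = [rng.gauss(0, 1) for _ in range(n_years)]
        cash = [rng.gauss(0, 1) for _ in range(n_years)]
        earnings = [a + c for a, c in zip(accruals, cash)]
        random_year = rng.randrange(n_years)                     # PART unrelated to performance
        low_year = min(range(n_years), key=earnings.__getitem__) # PART = lowest-earnings year
        reject_random += healy_type_t(accruals, random_year) < -T_CRIT_5PCT_8DOF
        reject_low += healy_type_t(accruals, low_year) < -T_CRIT_5PCT_8DOF
    return reject_random / n_firms, reject_low / n_firms

random_rate, low_rate = rejection_rates()
print(random_rate, low_rate)
```

Because earnings are the sum of cash flows and accruals, choosing the event year on low earnings also selects low accruals, so the second rejection rate exceeds the first even though no earnings are managed in the simulated data.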

The evidence in table 3 suggests that before attributing causation to the investigated stimulus, 
the researcher should ensure that the results are not induced by omitted variables correlated with 
earnings performance. Holthausen et al. (1995) illustrate this point in their extension of Healy's 
(1985) paper on executive bonus plans. They conclude that Healy's lower bound results are 
induced by the correlation between his partitioning variable and earnings performance and that 
Healy prematurely attributes the earnings management to bonus plans. We provide further 
discussion of this problem in section VI. 

The second two samples of firm-years are selected on the basis of high cash from operations
and low cash from operations performance, respectively. Event time plots for these two samples 
of firms are provided in figure 2. The middle plot is of cash from operations. As expected, the high 
cash from operations sample climbs to a peak in year 0 and declines thereafter. The low cash from 
operations sample exhibits the opposite behavior, falling to a trough in year 0 and improving 
thereafter. The bottom plot is of earnings, which follow a similar, though less pronounced pattern 
to cash from operations. The top plot is of total accruals and is markedly different from the other 
two plots. In every year except for the event-year, total accruals are very similar for the two 
samples. In the event-year, the low cash from operations firms experience a sharp increase in total 
accruals, while the high cash from operations firms experience a sharp decrease in total accruals. 
The event-year accrual changes are opposite in sign, but about half as large as the corresponding 
changes in cash from operations. These results are consistent with the findings of Dechow (1994), 
who hypothesizes that this negative correlation results from the application of the matching 
principle under accrual accounting. Dechow's evidence suggests that the event-year accrual 
changes represent nondiscretionary accruals that are made with the objective of eliminating 
temporary mismatching problems in cash from operations. If matching is the cause of the negative
correlation, then a well specified model of nondiscretionary accruals should control for this effect. 
However, the results in table 4 indicate that existing models do not completely control for this 
negative correlation. 

FIGURE 2
Time Series of Median Annual Total Accruals, Cash from Operations and Earnings all
Standardized by Lagged Total Assets. Year 0 is the Year in which Firm-Years are
Selected from the Lowest and Highest Decile of Cash from Operations. Sample Consists
of 1000 Firm-Years Randomly Selected from Firm-Years in the Lowest and Highest
Decile of Cash from Operations.

[Plots omitted. Panels: median total accruals; median cash from operations; median earnings.]

Table 4 reports the proportion of type I errors for the high and low cash from operations
samples. Panel A indicates that the low cash from operations sample generates type I errors that
are all significantly greater than the specified test levels for the null hypothesis that earnings
management ≤ 0. For example, at the five percent test level the rejection frequencies range from
a low of 32.9% for the DeAngelo Model to a high of 46.7% for the Healy Model. This stems from
the regularity documented in figure 2 that firms with low cash from operations tend to have high
total accruals. The opposite problem is observed when testing the null hypothesis that earnings
management ≥ 0. Because total accruals tend to be high, discretionary accruals generated by the
various models tend to be high, and the frequency of type I errors tends to be lower than the
specified test levels.

Panel B of table 4 reports results for the high cash from operations sample. Recall that the high 
cash from operations sample has low total accruals in event year 0. The results for this sample 
indicate that the null hypothesis that earnings management ≤ 0 tends to be under-rejected relative
to the specified test levels, while the null hypothesis that earnings management ≥ 0 tends to be
over-rejected. The over-rejections are most serious for the Healy Model, which rejects 50.0% of the time at the five percent test level. These results
illustrate the problem faced by Healy (1985) in his upper bound tests. Healy hypothesizes that the 
executives of firms in which cash from operations exceeds the upper bounds specified in their top 
executive bonus plans manage earnings downwards. However, panel B illustrates that estimated 
discretionary accruals generally tend to be low for firms with high cash flows. The upper bound 
results reported in Healy’s table 2 are therefore likely to overstate the amount of earnings 
management that takes place at the upper bound. Healy recognizes this potential problem and 
controls for it through the use of a control sample in his table 4 results. 

More generally, any earnings management study in which the stimulus under investigation 
is correlated with cash flow performance is likely to produce misspecified tests. For example, 
Gaver et al. (1995) replicate Healy’s lower bound results using nondiscretionary earnings to 
classify firms relative to the lower bounds specified in their executive bonus plans. Gaver et al. 
measure nondiscretionary earnings as the sum of cash from operations plus nondiscretionary 
accruals, as generated by the Jones model. The resulting measure of nondiscretionary earnings 
is highly positively correlated with cash from operations (the mean Pearson correlation exceeds 
0.8). Thus, their tests are likely to suffer from the misspecification demonstrated in panel A of 
table 4.¹² In particular, the lower bound sample is biased toward rejecting the null hypothesis that
discretionary accruals are less than or equal to zero in favor of the alternative hypothesis that 
accruals are managed upwards. This result is documented by Gaver et al. and attributed to 
managerial “smoothing” of earnings. 


¹² In additional tests (not reported) we reestimated the table 4 results using the Gaver et al. (1995) measure of
nondiscretionary earnings in place of cash from operations. The results confirm that the low nondiscretionary earnings
sample over-rejects the null hypothesis that discretionary accruals are less than or equal to zero in favor of the alternative
hypothesis that they are greater than zero. For example, the Jones model (which is used by Gaver et al.) rejects the null
hypothesis that earnings management is less than or equal to zero 37.1% (17.4%) of the time using a five percent (one
percent) test level.

TABLE 4
Comparison of the Type I Errors for Tests of Earnings Management Based on
Alternative Models to Measure Discretionary Accruals. Percentage of 1000 Firm-Years
Randomly Selected from Firm-Years in the Lowest and Highest Decile of Cash From
Operations Performance for which the Null Hypothesis of No Earnings Management is
Rejected (One-Tailed Tests).

Null Hypothesis:            Earnings management ≤ 0    Earnings management ≥ 0
Test Level:                     5%         1%              5%         1%

Panel A: Lowest decile of cash from operations performance

Healy Model:
  t-test                      46.7%**    24.1%**          1.2%**     0.3%**
DeAngelo Model:
  t-test                      32.9**     12.4**           1.0**      0.2**
Jones Model:
  t-test                      42.9**     19.2**           1.2**      0.5
Modified Jones Model:
  t-test                      44.5**     21.7**           1.1**      0.5
Industry Model:
  t-test                      45.0**     22.4**           1.2**      0.2**

Panel B: Highest decile of cash from operations performance

Healy Model:
  t-test                       0.0%**     0.0%**         50.0%**    23.9%**
DeAngelo Model:
  t-test                       0.5**      0.1**          [illegible] 12.4**
Jones Model:
  t-test                       0.3**      0.1**          46.2**     19.9**
Modified Jones Model:
  t-test                       0.3**      0.1**          46.4**     20.3**
Industry Model:
  t-test                       0.2**      0.0**          46.7**     21.9**

* Significantly different from the specified test level at the 5 percent level using a two-tailed binomial test.
** Significantly different from the specified test level at the 1 percent level using a two-tailed binomial test.


Samples of Firm-Years with Artificially Induced Earnings Management


The results of the simulations using artificially induced earnings management are summarized
in figures 3 and 4. Figure 3 provides information concerning bias in the estimates of earnings
management produced by the competing models. For the sake of parsimony, we provide plots for
only three models: the Healy Model; the Jones Model; and the Modified Jones Model. The results
for the DeAngelo and Industry models are indistinguishable from those documented for the Healy
and Modified Jones models. For each model and for each assumed source of earnings manipulation,
we provide a plot of detected earnings management (vertical axis) against induced earnings
management (horizontal axis). Since our simulations are based on a large number of independent
observations, an unbiased estimator is expected to result in a 45 degree line (i.e., detected earnings
management is expected to equal induced earnings management). In each graph, the thin line
represents the 45 degree line that would be generated by an unbiased estimator, and the thick line
represents the results of our simulations.

The first column of graphs provides results for artificially induced expense manipulation. 
The thick line lies atop the thin line in all cases, indicating that all models provide unbiased tests 
of expense-based earnings management. The second column of graphs provides results for
artificially induced revenue manipulation. It is evident that the estimates of earnings management
provided by the Jones Model are biased downward. The change in revenue is used as an 
independent variable to extract nondiscretionary accruals in the Jones Model, thereby extracting 
part of the revenue-based earnings management. The magnitude of the bias indicates that 
approximately one-quarter of the induced earnings management is not detected. The Modified 
Jones Model does not suffer from this bias. The third and final column presents the results for 
artificially induced margin manipulation. Again, only the Jones Model produces biased estimates 
of discretionary accruals. The downward bias is approximately one-third of the induced earnings 
management and is more serious than for the case of revenue manipulation because margin 
manipulation requires a larger amount of revenue management for a given amount of earnings 
management.
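
The source of the Jones Model bias can be seen in a stylized calculation. The sketch assumes that induced revenue is booked entirely as receivables, that margin manipulation of a given amount raises revenue by that amount divided by the net margin, and that the model coefficients are estimated only in the unmanaged estimation period. The revenue coefficient of 0.1 and the net margin of 0.5 are hypothetical illustrative values, not estimates from the study, so the resulting bias magnitudes are not those plotted in figure 3.

```python
def induced_revenue(induced, source, net_margin):
    """Revenue added in the event year for a given amount of induced earnings."""
    if source == "expense":
        return 0.0                      # expenses are delayed; revenue is untouched
    if source == "revenue":
        return induced                  # revenue is recognized early, with no matching cost
    if source == "margin":
        return induced / net_margin     # revenue needed to yield the induced earnings
    raise ValueError(f"unknown source: {source}")

def detected_jones(induced, revenue_coef, source, net_margin=0.5):
    """Jones-type model: the full change in revenue is treated as nondiscretionary."""
    extracted = revenue_coef * induced_revenue(induced, source, net_margin)
    return induced - extracted

def detected_modified_jones(induced, revenue_coef, source, net_margin=0.5):
    """Modified Jones-type model: the change in receivables is netted out of the
    change in revenue, so induced credit revenue is not treated as nondiscretionary."""
    return induced

induced = 0.02        # hypothetical: 2% of lagged assets
revenue_coef = 0.1    # hypothetical coefficient on the change in revenue
for source in ("expense", "revenue", "margin"):
    print(source,
          detected_jones(induced, revenue_coef, source),
          detected_modified_jones(induced, revenue_coef, source))
```

In this sketch the Jones-type estimate falls short of the induced amount whenever revenue is affected, and the shortfall is larger for margin manipulation because more revenue is required per unit of earnings.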

Figure 4 provides information concerning the relative power of the alternative models for 
detecting earnings management. These graphs plot the frequency with which the null hypothesis 
of no earnings management is rejected (vertical axis) against the magnitude of the induced 
earnings management (horizontal axis). A separate graph is provided for each model and for each 
assumed source of earnings manipulation. All rejection rates are computed at the five percent 
level using a one-tailed test.¹³ The first graph reports the power function for the Healy Model (thin
line). Healy’s power function is also provided in the graphs of the remaining models to provide 
a benchmark for evaluating their relative power. The power functions for the remaining models 
are presented using the thicker lines. 

The first column of graphs provides the power functions for expense manipulation. The
DeAngelo Model lies substantially below the Healy Model because the standard errors of the 
estimate of earnings management (table 1) tend to be significantly higher for the DeAngelo 
Model. The Jones, Modified Jones and Industry models are all slightly more powerful than the 
Healy Model. Again, this arises because they have slightly lower standard errors. Though it is not 
readily apparent from the graphs, the Jones and Modified Jones models are more powerful than 
the Healy and Industry models for all levels of induced earnings management. The second column
of graphs provides results for revenue-based earnings management. The only significant change 
from the preceding column is that the power function for the Jones Model now lies below that of 
the Healy Model, due to the bias documented in figure 3. The Jones Model unintentionally extracts some
of the revenue-based earnings management leading to a downwardly biased estimate of earnings 
management and correspondingly reducing the power of the test. The Modified Jones Model 
continues to dominate the other models. The third and final column provides the results for 
margin-based earnings management. The only significant change in this column is that the power
of the Jones Model drops even further due to the downwardly biased estimate of earnings 
management. The Modified Jones Model still dominates all the other models, although it only 
dominates the Industry Model by a small margin. It should, however, be noted that the odds are 
stacked in favor of the Industry Model. We have implicitly assumed that earnings management 
is not clustered by industry (i.e., when we induce earnings management in a firm-year, we do not
induce earnings management in the industry matched firm-years). To the extent that this
assumption is violated, the power of the tests based on the Industry Model is overstated in our
simulations. 
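
The link between standard errors and power can be approximated as follows. The sketch assumes an unbiased estimate of earnings management and replaces the t-distribution with a normal approximation, so it does not reproduce the simulated power functions in figure 4; it uses the median standard errors from table 1, and the function name and the induced amount of 10% of lagged assets are hypothetical.

```python
from statistics import NormalDist

def approx_power(induced, std_error, test_level=0.05):
    """Approximate probability of rejecting 'earnings management <= 0' in a
    one-tailed test when the true (induced) amount is `induced`.

    Assumptions: unbiased estimator, normal approximation to the t-test.
    """
    z_crit = NormalDist().inv_cdf(1 - test_level)
    return 1 - NormalDist().cdf(z_crit - induced / std_error)

# Median standard errors of the coefficient on PART, from table 1.
median_se = {
    "Healy": 0.065,
    "DeAngelo": 0.090,
    "Jones": 0.060,
    "Modified Jones": 0.060,
    "Industry": 0.063,
}
power_at_10pct = {model: approx_power(0.10, se) for model, se in median_se.items()}
for model, power in power_at_10pct.items():
    print(f"{model}: {power:.3f}")
```

With no induced earnings management the function returns the test level, and for a given induced amount a lower standard error yields higher approximate power.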


¹³ We replicated the results using a one percent test level. The relative rankings of the models are identical. We also
performed identical tests assuming accruals are downwardly managed. The tenor of the bias and power results is 
unchanged. 


Sample of Firm-Years in which the SEC Alleges Earnings are Overstated


Figure 5 provides event time plots of total accruals, cash from operations and earnings for the 
sample of 32 firms alleged by the SEC to have overstated earnings. Year 0 represents the year in 
which the SEC alleges that earnings are overstated.¹⁴ To provide a benchmark for comparison,
plots are also provided for the sample of 1000 randomly selected event-years. The plot of median 
total accruals indicates that accruals are abnormally high in the years leading up to and including 
year 0 and are abnormally low thereafter. The fact that total accruals are higher for the SEC sample 
relative to the random sample in event-year 0 is consistent with the joint hypothesis that total 
accruals measure discretionary accruals and that discretionary accruals are positive. The plot also 
reveals a sharp decline in accruals in event year one, which is consistent with the managed 
accruals reversing. 

The cash from operations plot indicates that cash flows tend to be slightly lower than normal 
for the SEC sample. The earnings plot indicates that earnings are close to the random sample in 
the years up to and including event-year 0, and substantially lower thereafter. Thus, the 
abnormally high accruals in years —5 through 0 have the effect of masking the lower cash flows 
and inflating reported earnings. This is consistent with management attempting to delay a decline 
in reported earnings through accrual management. 

Table 5 summarizes the results from tests of earnings management using the alternative 
models to generate discretionary accruals. For each model of discretionary accruals, the table 
reports descriptive statistics on the estimates of earnings management, their standard errors and
t-statistics, along with the aggregate Z-statistic. The Z-statistic is positive and highly statistically 
significant at conventional levels for all five models, supporting the hypothesis that earnings have 
been managed upwards. The statistic is the largest for the Modified Jones Model (5.76) followed 
by the Industry Model (5.00), the Healy Model (3.90), the Jones Model (3.69) and the DeAngelo 
Model (2.88). A comparison of the point estimates of earnings management and their associated 
standard errors permits the source of the differences in the Z-statistics to be examined. The Jones 
and Modified Jones Models have standard errors that are markedly lower than the other models. 
This reinforces our previous findings from table 1 that the Jones and Modified Jones Models are 
more successful at explaining variation in accruals. The lower standard errors explain the source 
of their power. The low power of the Jones Model relative to the Modified Jones Model stems 
from its smaller estimates of earnings management. These smaller estimates are consistent with 
the SEC sample including firms that overstate revenues and these overstatements not being 
detected by the Jones Model. This reason is investigated in more detail in table 6. Finally, the 
relatively high Z-statistic for the Industry Model stems from a combination of a high point 
estimate of earnings management relative to the Jones Model and a low standard error relative to 
the Healy and DeAngelo Models.¹⁵
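
The construction of the aggregate Z-statistic is not restated in this section. One common construction, assumed here for illustration, standardizes each firm's t-statistic by the standard deviation of a t-distribution with the same degrees of freedom and then sums across firms. The function name, the t-statistics and the degrees of freedom below are hypothetical.

```python
import math

def aggregate_z(t_stats, dofs):
    """Aggregate firm-specific t-statistics into a single Z-statistic.

    Assumed form: Z = (1 / sqrt(N)) * sum_j t_j / sqrt(k_j / (k_j - 2)),
    where k_j > 2 is the degrees of freedom of firm j's t-statistic and
    k_j / (k_j - 2) is the variance of a t-distributed variable.
    """
    if any(k <= 2 for k in dofs):
        raise ValueError("each t-statistic needs more than 2 degrees of freedom")
    standardized = [t / math.sqrt(k / (k - 2)) for t, k in zip(t_stats, dofs)]
    return sum(standardized) / math.sqrt(len(standardized))

# Hypothetical example: four firms, each with a t-statistic of 1.0 and 10 degrees of freedom.
z_example = aggregate_z([1.0, 1.0, 1.0, 1.0], [10, 10, 10, 10])
print(round(z_example, 3))
```

The example shows how individually insignificant t-statistics of the same sign can aggregate into a larger Z-statistic, which is why modest median t-statistics in table 5 are compatible with highly significant Z-statistics.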


¹⁴ Some firms are alleged to have overstated earnings for two or more consecutive years. In figure 5, event year 0 pools
across all observations for which overstatement is alleged, event year —1 is the year prior to the first year in which 
overstatement is alleged, and event year +1 is the year following the last year in which overstatement is alleged. Note 
that in the regression analysis, PART is coded as one in years when earnings management is alleged and zero otherwise. 

¹⁵ Firms subsequently restate earnings in 39 of the 56 firm-years in which earnings overstatement is alleged by the SEC.
These 39 observations provide us with an opportunity to investigate the extent of earnings management detected by the
models compared to that identified by the SEC. The mean (median) restatement is 4.6% (2.3%) of assets. The mean
(median) detected earnings management as a percent of assets for the Healy Model is 14.7 (5.6); the DeAngelo Model
is 14.6 (2.3); the Jones Model is 10.5 (5.3); the Modified Jones Model is 15.9 (7.1); and the Industry Model is 15.4 (8.1).
These results are consistent with either (i) the SEC identifying or requiring only a subset of the total earnings 
management to be restated by the firms; or (ii) the models systematically overstating the magnitude of earnings 
management in this sample. 

FIGURE 5
Time Series of Median Annual Total Accruals, Cash From Operations and Earnings all
Standardized by Lagged Total Assets. Year 0 is the Year in which the SEC Alleges that
the Firm has Overstated Earnings. The SEC Sample Consists of 32 Firms Identified by
the SEC for Overstating Annual Earnings. The Random Sample Consists of 1000
Randomly Selected Firm Years.

TABLE 5
Results of Tests for Earnings Management Using Alternative Models to Measure
Discretionary Accruals. Sample of 32 Firms Targeted by the SEC in Accounting
and Auditing Enforcement Releases (AAERs) between 1982 and 1992 for
Allegedly Overstating Earnings.

                                     Standard    Lower                 Upper
Model                        Mean    Deviation   Quartile    Median    Quartile
Healy Model:
  earnings management       0.236      0.475      -0.022      0.058     0.258
  standard error            0.203      0.255       0.084      0.126     0.201
  t-statistic               0.760      1.310      -0.258      0.670     1.606
  Z-statistic = 3.90**
DeAngelo Model:
  earnings management       0.278      0.581      -0.011      0.089     0.310
  standard error            0.269      0.277       0.118      0.168     0.279
  t-statistic               0.564      0.907       0.088      0.467     1.224
  Z-statistic = 2.88**
Jones Model:
  earnings management       0.138      0.374      -0.023      0.061     0.172
  standard error            0.158      0.183       0.075      0.105     0.158
  t-statistic               0.754      1.414      -0.165      0.675     1.744
  Z-statistic = 3.69**
Modified Jones Model:
  earnings management       0.171      0.333       0.002      0.083     0.284
  standard error            0.136      0.103       0.070      0.106     0.156
  t-statistic               1.193      1.991       0.086      0.895     2.020
  Z-statistic = 5.76**
Industry Model:
  earnings management       0.218      0.418      -0.015      0.090     0.280
  standard error            0.198      0.257       0.073      0.123     0.227
  t-statistic               0.972      1.498      -0.123      1.038     1.488
  Z-statistic = 5.00**

Notes:
Earnings management represents the estimated coefficient on PART, $b_i$, from firm-specific regressions of
$DAP_{it} = a_i + b_i PART_{it} + e_{it}$; where DAP is the measure of discretionary accruals produced by each of the models and PART is
an indicator variable equal to 1 in a year in which earnings management is hypothesized to occur in response to the
stimulus identified by the researcher and 0 otherwise. Standard error is the standard error of the coefficient on PART for
each of the regressions and t-statistic is the t-statistic testing the null hypothesis that the coefficient on PART is equal to
zero.

** Significantly different from zero at the 1 percent level using a two-tailed test.

TABLE 6
Results of Tests for Earnings Management Using Alternative Models to Measure
Discretionary Accruals. Comparison of the Jones and Modified Jones Models on the
SEC Sample Stratified by the Source of the Alleged Earnings Overstatement. Sample
of 32 Firms Targeted by the SEC in Accounting and Auditing Enforcement Releases
(AAERs) between 1982 and 1992.

                                     Standard    Lower                 Upper
Model                        Mean    Deviation   Quartile    Median    Quartile

Panel A: Sample consists of 18 firms managing revenues

Jones Model:
  earnings management       0.005      0.185      -0.030      0.038     0.095
  Z-statistic = 1.56
Modified Jones Model:
  earnings management       0.091      0.288       0.009      0.074     0.183
  Z-statistic = 3.88**

Panel B: Sample consists of 14 firms not managing revenues

Jones Model:
  earnings management       0.310      0.482      -0.017      0.122     0.513
  Z-statistic = 3.80**
Modified Jones Model:
  earnings management       0.274      0.368      -0.005      0.118     0.515
  Z-statistic = 4.31**

Notes:
Earnings management represents the estimated coefficient on PART, $b_i$, from firm-specific regressions of
$DAP_{it} = a_i + b_i PART_{it} + e_{it}$; where DAP is the measure of discretionary accruals produced by each of the models and PART is
an indicator variable equal to 1 in a year in which earnings management is hypothesized to occur in response to the
stimulus identified by the researcher and 0 otherwise.

** Significantly different from zero at the 1 percent level using a two-tailed test.


Table 6 provides an analysis of the impact of revenue-based earnings management on the 
performance of the Jones Model. The sample is stratified by the source of the earnings 
overstatement that is alleged by the SEC. Fifteen of the sample firms are accused of overstating 
revenues alone. A further three firms are accused of overstating revenues in combination with 
understating expenses. The remaining 14 firms are accused of understating expenses. We form 
two samples consisting of the 18 firms that are alleged to have overstated revenues and the 14 
firms for which no overstatement of revenues is alleged. Table 6 reports the results of tests for 
earnings management applied to each of these two samples using the Jones and Modified Jones 
Models. 
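As an illustration of the firm-specific regression behind these earnings management estimates, the following minimal sketch regresses a discretionary accrual series (DAP) on the indicator PART by ordinary least squares. The data values and the helper name ols_slope_intercept are hypothetical and are not taken from the study. With a single 0/1 regressor, the slope equals the difference between mean DAP in PART = 1 years and mean DAP in PART = 0 years.

```python
def ols_slope_intercept(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical firm: four estimation years (PART = 0) and two event years (PART = 1).
# DAP is scaled by lagged assets, as in the table.
part = [0, 0, 0, 0, 1, 1]
dap = [0.01, -0.02, 0.00, 0.01, 0.08, 0.10]

a_hat, b_hat = ols_slope_intercept(part, dap)

# b_hat is the firm's earnings management estimate; it matches the difference in means.
mean_event = sum(d for p, d in zip(part, dap) if p == 1) / part.count(1)
mean_other = sum(d for p, d in zip(part, dap) if p == 0) / part.count(0)
print(round(b_hat, 4), round(mean_event - mean_other, 4))
```

The sketch covers only the coefficient estimate; the aggregation of firm-level results into the Z-statistic reported in the table is not reproduced here.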

Panel A of table 6 reports the results for the sample for which revenue overstatements are 
alleged. The Z-statistic of 1.56 for the Jones Model is insignificantly different from zero at 
conventional levels, while the Z-statistic of 3.88 for the Modified Jones Model is highly 
significant. Inspection of the earnings management estimates for these two models indicates that 
the higher Z-statistic for the Modified Jones Model results from substantially larger estimates of 
earnings management. The mean (median) estimate of earnings management is 0.5% (3.8%) of 
lagged assets for the Jones Model and 9.1% (7.4%) of lagged assets for the Modified Jones Model. 
Panel B of table 6 reports results for the sample for which no revenue-based overstatements of 
earnings are alleged. The Z-statistics of 3.80 for the Jones Model and 4.31 for the Modified Jones 
Model are similar and statistically significant. Further inspection reveals that the earnings 
management estimates are also very similar. Thus, consistent with the results from our artificially 
managed samples, the two models appear to perform similarly in detecting expense-based 
earnings management. Overall, the results in table 6 provide confirmatory evidence that the 
Modified Jones Model is more powerful than the Jones Model in the presence of revenue-based 
earnings management. 

The results in tables 5 and 6 provide descriptive evidence on the relative performance of the 
alternative models for measuring discretionary accruals. The results in table 7 directly investigate 
the frequency of type II errors for the competing models. Table 7 reports the proportion of the 
firms in the SEC sample for which the null hypothesis that discretionary earnings is less than or 
equal to zero is rejected. If it is assumed that all models are well specified and that the SEC has 
correctly identified firms that managed earnings, then the proportions of rejections in table 7 
provide estimates of the relative power of the tests. The results indicate that the Modified Jones 
Model rejects the null hypothesis most frequently, followed by the Industry Model, the Jones 
Model, the Healy Model and the DeAngelo Model. These rankings correspond closely to the 
rankings of the power functions obtained in the simulation tests and reinforce the documented 
superiority of the Modified Jones Model. 
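The comparison of a rejection frequency with the nominal test level can be sketched with an exact binomial tail probability. The example below is only illustrative: the count of 6 rejections is inferred from 18.8 percent of 32 firms, the function name is hypothetical, and only the upper tail is computed, whereas table 7 reports a two-tailed binomial test.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_firms = 32        # SEC sample size
rejections = 6      # assumed count: 18.8 percent of 32 firms
test_level = 0.05   # nominal rejection rate if there were no earnings management

# Probability of seeing at least this many rejections by chance alone.
tail = binomial_upper_tail(rejections, n_firms, test_level)
print(f"P(X >= {rejections}) = {tail:.4f}")
```

A small tail probability indicates that the observed rejection frequency is unlikely to arise from the nominal test level alone.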


VI. CONCLUSIONS AND IMPLICATIONS 


This paper evaluates the ability of alternative models to detect earnings management. The 
results suggest that all the models considered appear to produce reasonably well specified tests 
for a random sample of event-years. However, the power of the tests is low for earnings 
management of economically plausible magnitudes. When the models are applied to samples of 
firm-years experiencing extreme financial performance, all models lead to misspecified tests. In 
this respect, our results highlight the conditions under which misspecified tests are likely to arise. 
However, we hasten to add that establishing the extent to which the results of an existing study 
are misspecified requires a detailed reexamination of that study (e.g., Holthausen et al.’s 1995 
reexamination of Healy 1985). Finally, we find that a modified version of the model developed 
by Jones (1991) provides the most powerful tests of earnings management. 

The findings in this study provide three major implications for research on earnings 
management. First, regardless of the model used to detect earnings management, the power of the 
tests is relatively low for earnings management of economically plausible magnitudes. Subtle 
cases of earnings management in the order of, say, one percent of total assets require sample sizes 
of several hundred firms to provide a reasonable chance of detection. Our analysis has focused 
primarily on documenting the properties of existing models. Further research to develop models 
that generate better specified and more powerful tests will further enhance our ability to detect 
earnings management.¹⁶ 
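To give a feel for why subtle earnings management calls for large samples, the sketch below applies the textbook sample-size approximation for a one-tailed z-test of a mean. It is not the procedure used in this study. The cross-sectional standard deviation of 10 percent of assets, the 80 percent power target, and the function name firms_needed are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def firms_needed(effect, sigma, alpha=0.05, power=0.80):
    """Approximate sample size for a one-tailed z-test to detect a mean of `effect`.

    effect: assumed true mean discretionary accrual (fraction of total assets)
    sigma:  assumed cross-sectional standard deviation of the accrual measure
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # critical value for the one-tailed test
    z_beta = z.inv_cdf(power)       # quantile matching the desired power
    return ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Hypothetical inputs: management of 1 percent of assets, noise of 10 percent of assets.
n = firms_needed(effect=0.01, sigma=0.10)
print(n)
```

Under these assumed inputs the required sample runs to several hundred firms, and halving the noise or doubling the effect reduces it by roughly a factor of four.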


16 Preliminary work in this direction is conducted by Beneish (1994). 
TABLE 7 


Comparison of Tests for Earnings Management Based on Alternative Models to 
Measure Discretionary Accruals. Percentage of Firms that are Alleged by the SEC to 
have Overstated Earnings for which the Null Hypothesis of No Earnings Management 
is Rejected (One-Tailed Tests). Sample of 32 Firms that are Targeted by the SEC in 
Accounting and Auditing Enforcement Releases (AAERs) between 1982 and 1992. 


Model                       Test level of 5%    Test level of 1%

Healy Model:
  t-test                         12.5%                6.3%
DeAngelo Model:
  t-test                          9.4                 0.0
Jones Model:
  t-test                         18.8                 6.3
Modified Jones Model:
  t-test                         28.1                12.5
Industry Model:
  t-test                         18.8**               9.4**


* Significantly different from the specified test level at the 5 percent level using a two-tailed binomial test. 
** Significantly different from the specified test level at the 1 percent level using a two-tailed binomial test. 


Second, if the earnings management partitioning variable is correlated with firm performance, 
then tests for earnings management are potentially misspecified for all of the models 
considered. Pertinent measures of firm performance include earnings performance and cash from 
operations performance. Two recommendations can be made when facing this problem. First, the 
researcher can evaluate the nature of the misspecification and conduct a qualitative assessment 
of how it affects statistical inferences. For example, the nature of the performance-related bias 
may be such that the coefficient on the earnings management partitioning variable is negatively 
biased, while the researcher's hypothesis predicts a positive coefficient. Thus, if the researcher 
finds a significant positive coefficient, it would be reasonable to conclude that the hypothesis is 
supported, since the misspecification works against finding the result. Second, the researcher can 
attempt to directly control for the performance related misspecification. Possible approaches 
include the use of a control sample (e.g., Healy 1985), inclusion of firm performance in the 
earnings management regression (e.g., DeAngelo et al. 1994) or some other form of analysis of 
variance that controls for firm performance (e.g., Holthausen et al. 1995). 

Finally, it is important to consider the relation between the context in which earnings 
management is hypothesized and the model of nondiscretionary accruals that is employed, 
because the model of nondiscretionary accruals may unintentionally extract the discretionary 
component of accruals. For example, if the Jones Model is used in a research context where 
discretion is exercised over revenues, then it is likely to extract the discretionary component of 
total accruals. Similarly, if the Industry Model is used in a research context where intra-industry 
correlation in discretionary accruals is expected, then it is likely to extract the discretionary 
component of total accruals. Consideration of the sample details should help avoid the use of a 
model of nondiscretionary accruals that unintentionally extracts discretionary accruals. 
I. INTRODUCTION 


A persistent concern across a variety of accounting settings is that practitioners can justify 
overly aggressive reporting decisions by exploiting the latitude inherent in the vague 
language used in professional standards.¹ Recently, standard setters have attempted to 
make it more difficult for practitioners to justify aggressive reporting decisions by making more 
stringent the language used to define the thresholds at which alternative disclosures (or 
measurements) should be made. Standard stringency is increased by changing the level and/or 
increasing the precision of the threshold denoted by a standard.² 

For such modifications of professional standards to be effective in reducing the aggressiveness 
of reporting decisions, at least two necessary conditions must be met. The first condition is 
that an incentive to report aggressively must influence reporting decisions (at least in part) by 
influencing the interpretation of vague standards. Unless practitioners use the latitude provided 
by a vague standard when making an aggressive reporting decision, modifying the standard to 
reduce that latitude will not constrain practitioners' reporting decisions. The second condition is 
that when a more stringent, precise standard is in place, practitioners must not compensate for 
losing the vague standard's latitude by assessing more liberally the evidential support for their 
preferred position. Otherwise, reporting decisions under the more stringent standard may be as 
aggressive as they were under the less stringent standard. Neither of these necessary conditions 
has been investigated previously, yet both must be satisfied for recent modifications of 
professional standards to be effective in mitigating the aggressiveness of reporting decisions. 

We examined each necessary condition in a separate experiment. Both experiments were set 
in a tax context to facilitate manipulation of practitioners' incentives. In both experiments, tax 
managers from Big 6 firms interpreted a professional standard, assessed evidential support, and 
made a reporting decision. 

Experiment 1 examined whether practitioners use the latitude inherent in a vague standard 
to support aggressive reporting decisions. Therefore, subjects were provided an incentive to report 
either aggressively or conservatively and a practice standard that employed a vague, verbal 
threshold. Results indicate that, as hypothesized, subjects who had an incentive to report 
aggressively made more liberal interpretations of the standard than those who had an incentive 
to report conservatively. Further, aggressive-incentive subjects' interpretations of the standard 
were liberal enough to justify their aggressive reporting decisions. Thus, the first condition 
necessary for modifications of professional standards to mitigate the aggressiveness of reporting 
decisions is satisfied. 

In experiment 2, we replaced the vague, verbal threshold of the experiment 1 standard with 
a more stringent, numerical threshold. The purpose of experiment 2 was to examine whether 
practitioners would compensate for the loss of latitude in interpreting the vague standard by 
interpreting more liberally the evidence supporting their preferred reporting position, such that 
aggressive reporting decisions are still supported under the more stringent standard. Therefore, 


1 For example, the IRS Director of Practice alleged that tax preparers recommended excessively aggressive positions to 
their clients by interpreting liberally the “substantial authority" standard (Shapiro 1987). Similarly, the General 
Accounting Office has alleged that bank managers and auditors avoid recognizing contingent losses by defining 
“probable” to mean “virtually certain" when applying standards like SFAS No. 5 and SAS No. 58 (GAO 1991). 

2 For example, the IRS now enforces a "realistic possibility" standard in place of the "substantial authority" standard, and 
both the IRS (Treas. Reg. section 1.6694-2(b)(1)) and the American Bar Association (opinion 85-352) define the 
"realistic possibility" standard numerically as a one-in-three chance of success. Similarly, the Financial Accounting 
Standards Board (FASB) implemented critics’ suggestions to use “more likely than not" in place of "probable" in several 
recent standards, and even defined “more likely than not" numerically (paragraph 17 of SFAS No. 109). 
all subjects in experiment 2 were provided an incentive to report aggressively and one of two 
standards that were more stringent than the standard used in experiment 1. The judgments and 
decisions of these subjects were compared with those of subjects in experiment 1 to assess the 
effect of standard stringency. (Subjects were assigned randomly across experiments to enable this 
comparison.) Results indicate that, as hypothesized, subjects given a more stringent standard 
used the latitude available in assessing evidential support to justify an aggressive reporting 
decision. This shift in incentive effect was pronounced enough to render reporting decisions made 
under the more stringent standards as aggressive as reporting decisions made under the less 
stringent standard. Thus, the second condition necessary for modifications of professional 
standards to mitigate the aggressiveness of reporting decisions is not met. 

These results indicate that, when practitioners have an incentive to report aggressively, 
modifications designed to make professional standards more stringent may prove relatively 
ineffective in reducing the aggressiveness of practitioners' reporting decisions. Although the 
basic decision process examined in this study may be the same in tax, financial reporting, and 
auditing, there are fundamental differences in professional responsibilities and penalty structures 
across these contexts. Therefore, because our experiment is set in a tax context, future research 
should test the generality of these results to financial reporting and auditing contexts. 

The rest of this paper proceeds as follows. Section II presents background and hypotheses 
relevant to each of the two conditions necessary for modifications in professional standards to 
mitigate aggressive reporting. Sections III and IV present experiments 1 and 2, which test the first 
and second conditions, respectively. Section V provides an overall discussion of results and 
opportunities for future research. 


II. BACKGROUND AND HYPOTHESES 


The Effect of Incentives on Interpretations of Vague Standards 


The choice among alternative reporting positions often is specified unambiguously by 
professional standards. For example, the tax code states unambiguously that municipal bond 
interest income is not taxable, and SFAS No. 5 states unambiguously that gain contingencies 
should not be recognized. However, in some circumstances professional standards do not specify 
unambiguously the appropriate reporting position. In such circumstances, practitioners may have 
an incentive to select a reporting position that portrays events most favorably (see, e.g., Johnson 
1993 and Ronen and Sadan 1981 in tax and financial reporting contexts, respectively). For 
purposes of this paper, a practitioner is said to have made an "aggressive" reporting decision if 
the practitioner selects the reporting position that portrays events favorably when that position is 
not indicated clearly by the facts and relevant professional literature. 

One function of professional standards is to constrain the degree to which practitioners can 
report aggressively. This constraint is typically achieved by requiring that the evidential support 
for an aggressive reporting position meet some threshold in order to avoid penalty should the 
position later be questioned. For example, IRC section 6694 requires that an undisclosed tax 
position be taken only if it has a "realistic possibility" of being sustained if litigated. Similarly, 
SFAS No. 5 only permits a practitioner to avoid accruing an estimable, material contingent loss 
if the likelihood of the loss is deemed "reasonably possible" as opposed to "probable." 
Practitioners know they may have to justify their reporting decisions in the future, and that the 
potential for penalties³ depends on whether they are seen as applying standards appropriately. 


3 Penalties might include fines, litigation, censure, loss of reputation, and loss of license to practice. 
The ability of professional standards to constrain aggressive reporting may be diminished 
when vague expressions like “realistic possibility” and “probable” are used to denote the 
thresholds at which alternative disclosures should be made. Specifically, practitioners with a 
sufficient incentive to report aggressively might adopt liberal interpretations of the vague 
thresholds denoted by such expressions in order to justify their preferred reporting position. 
Ceteris paribus, the more liberal the interpretation of the standard, the more likely the evidence 
will support an aggressive reporting position. 

Prior research indicates that such latitude in interpretation exists,⁴ but no prior research has 
determined whether practitioners use the latitude available in interpreting vague professional 
standards to justify aggressive reporting positions. In accounting/auditing, Jiambalvo and Wilner 
(1985) attempted to examine the influence of auditors’ incentives on their interpretation of SFAS 
No. 5 and disclosure decisions. However, Jiambalvo and Wilner’s manipulation of auditors’ 
incentives (by varying client preferences) was unsuccessful in influencing any variables, so they 
could not investigate this issue. Jiambalvo and Wilner did provide evidence that auditors’ 
interpretations of professional standards influence their disclosure judgments. In tax, Johnson 
(1993) provides evidence that, when acting as an advocate for an aggressive client, tax preparers 
use the latitude available in interpreting evidential support to justify an aggressive reporting 
decision. Johnson also found that the incentive to act as an advocate influences reporting decisions 
in ways other than through assessment of evidential support. She hypothesized (ex post) that one 
way that this incremental effect might occur is through interpretation of the tax practice standard, 
but could not test this hypothesis. 

Thus, while prior research indicates that latitude exists in interpreting the vague language 
used in some professional standards, that interpretations of vague standards influence reporting 
decisions, and that incentives influence reporting decisions through assessments of evidential 
support, no prior research has demonstrated that incentives affect reporting decisions through 
interpretations of vague standards. Yet, a necessary condition for modifications of vague 
professional standards to mitigate the aggressiveness of reporting decisions is that the effect of 
incentives on reporting decisions must occur (at least in part) through interpretations of the 
standard. Therefore, we test the following hypotheses: 


H1A: Practitioners with an incentive to make an aggressive reporting decision will interpret 
more liberally a vague professional standard than will practitioners with an incentive 
to make a conservative reporting decision. 


H1B: Practitioners who interpret liberally a vague professional standard are more likely to 
make an aggressive reporting decision than are practitioners who interpret 
conservatively the vague professional standard. 


The focus in H1A and H1B on interpretation of professional standards is not intended to 
preclude an effect of incentives on reporting decisions in other ways.⁵ Rather, our focus is on 


4 Amer et al. (1994a), Reimers (1992), Harrison and Tomassini (1989), and Chesley (1986, 1979) examined the mean 
and variance of interpretations of the probability phrases used in professional standards. Amer et al. (1994b) examined 
the influence of event base rate on interpretations of vague standards. Raghunandan et al. (1991) and Schultz and 
Reckers (1981) examined the influence of materiality on SFAS No. 5 reporting decisions. None of these studies 
examined whether incentives for aggressive reporting influence interpretations of professional standards, or examined 
the relationship between interpretations of professional standards and reporting decisions. 

5 For example, a plausible alternative mechanism by which individuals might respond to an incentive to report 
aggressively is by increasing their willingness to risk penalty, i.e., "playing the audit lottery" (Beck and Jung 1989). 
interpretation of professional standards because support for H1A and H1B would satisfy the first 
condition necessary for changes in professional standards to mitigate the aggressiveness of 
reporting decisions. These hypotheses are tested in experiment 1. 


The Effect of Changes in Standards on Reporting Decisions 


If experiment 1 indicates that incentives influence reporting decisions through interpretations 
of vague professional standards, then changes in professional standards might mitigate the 
effect of incentives to report aggressively. Specifically, professional standards could be rendered 
more stringent by increasing the precision and/or changing the level of the threshold that they 
convey. By removing the opportunity for practitioners to interpret liberally the standard (and, 
hence, removing one avenue by which incentives influence reporting decisions), the effect of 
incentive on reporting decisions may be reduced. 

Prior research has not addressed this question. Regarding increases in precision, related work 
provides conflicting results. Prior research in psychology has found little difference in decisions 
based on information in the form of probabilities stated verbally (i.e., imprecisely) versus 
numerically (i.e., precisely). (See Wallsten 1990 for a review.) In contrast, prior research in 
auditing reports that the use of verbal versus numerical response modes results in differences in 
the level and/or consensus of auditors' inherent- and control-risk assessments (Reimers et al. 
1993; Dilla and Stone 1993; Stone and Dilla 1994). No research has examined the effectiveness 
of vague versus precise standards in reducing the extent to which incentives influence decisions 
made with respect to those standards. 

Theoretical research in economics and finance suggests that neither increases in the precision 
nor the level of vague standards will mitigate the extent to which practitioners make overly 
aggressive reporting decisions (Finnerty 1988; Kane 1981, 1977). Rather, such changes in 
regulation may merely encourage innovations that are designed to achieve through other means 
the outcomes that are consistent with the decision makers' incentives. Thus, even under a more 
stringent standard, incentive effects on reporting decisions may occur. Our hypothesis is: 


H2: Practitioners with an incentive to make an aggressive reporting decision will likely 
make an aggressive reporting decision, even when the effect of incentive on 
interpretation of standard is precluded. 


Innovation could take two forms. First, new ways of structuring transactions or presenting 
information could be developed that avoid the requirements of the modified standard. For 
example, third-party guarantors of the residual value of leased assets evolved to avoid the 90 
percent threshold imposed by SFAS No. 13 (Pulliam 1988). Also, financial instruments are 
developed to avoid or allow specific accounting treatments (Finnerty 1988). Our subjects were 
not allowed to make innovations of this type. Second, practitioners might compensate for the 
diminished latitude of the professional standard by exploiting more fully the latitude available in 
assessing the evidential support for the position favored by their incentives. The effect of such a 
shift would be to increase the degree to which assessments of evidential support favor the 
aggressive position under a more stringent standard. So long as assessments of evidential support 
influence the reporting decision, more liberal assessments will increase the degree to which the 
aggressive position is favored.⁶ Our hypotheses are: 


6 An analogous innovation is demonstrated by Kachelmeier and Messier (1990) in their investigation of the effects of a 
decision aid on auditor sample size judgments. Kachelmeier and Messier observe that sample sizes calculated by 
(continued on next page) 
H3A: Practitioners under a more stringent standard will interpret the evidence more liberally 
than will practitioners under a less stringent standard. 


H3B: Under a stringent standard, practitioners who interpret the evidence liberally are more 
likely to make an aggressive reporting decision than are practitioners who interpret the 
evidence conservatively. 


When viewed in combination, support for H2, H3A and H3B would indicate that the second 
condition for modifications of professional standards to mitigate aggressive reporting may not be 
satisfied. The shift in incentive effect from interpretation of standard to assessment of evidential 
support may be so great as to render reporting decisions under a more stringent standard as 
aggressive as reporting decisions under a less stringent standard. Hypotheses H2, H3A and H3B 
are tested in experiment 2. 


III. EXPERIMENT 1: THE EFFECT OF INCENTIVES ON INTERPRETATIONS 
OF VAGUE STANDARDS 


Method 


Subjects 

Tax preparers generally see their role as that of client advocate (IRS 1987; Kinsey 1987; 
Coyne 1987), and therefore have an incentive to recommend reporting positions that are more or 
less aggressive as the client wishes (Johnson 1993; Cloyd 1993; Klepper and Nagin 1991). This 
incentive has been legitimized by the AICPA (AICPA 1991), and acknowledged by preparers in 
surveys conducted by the Internal Revenue Service (IRS 1987). Thus, the incentives of tax 
preparers to report aggressively or conservatively can be operationalized through a short client 
description.⁷ Therefore, we set our experiments in a tax context, and examined the reporting 
decisions of experienced tax preparers.⁸ 

A total pool of 138 tax managers from five Big 6 firms was recruited. The managers were 
assigned randomly to one of the two experiments and to one of the conditions within each 
experiment, with the constraint that the managers from each firm be distributed evenly across 
experimental conditions. Descriptive information about the participants (determined by 
responses to debriefing questions) is shown in table 1. 
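One way to implement random assignment with an even spread of each firm's managers across conditions is to shuffle within firm and then deal subjects to conditions in rotation. The sketch below is a hypothetical illustration, not the authors' documented procedure: the firm names, subject identifiers, condition labels, and the function assign_balanced are all assumptions.

```python
import random
from collections import Counter


def assign_balanced(subjects_by_firm, conditions, seed=0):
    """Randomly assign subjects so each firm is spread evenly across conditions."""
    rng = random.Random(seed)
    assignment = {}
    for firm, subjects in subjects_by_firm.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)                    # random order within the firm
        start = rng.randrange(len(conditions))   # random starting condition
        for i, subject in enumerate(shuffled):
            condition = conditions[(start + i) % len(conditions)]
            assignment[subject] = (firm, condition)
    return assignment


# Hypothetical pool: two firms and four experimental conditions.
subjects_by_firm = {
    "Firm A": [f"A{i}" for i in range(10)],
    "Firm B": [f"B{i}" for i in range(7)],
}
conditions = ["condition 1", "condition 2", "condition 3", "condition 4"]

assignment = assign_balanced(subjects_by_firm, conditions)
cell_counts = Counter(assignment.values())
for cell in sorted(cell_counts):
    print(cell, cell_counts[cell])
```

Dealing in rotation guarantees that, within any firm, the number of managers per condition differs by at most one, while the shuffle keeps the identity of who lands in which condition random.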


Footnote 6 continued from previous page 
decision aids when auditors only provide input parameters are larger than the sample size judgments of auditors who 
use the decision aid to help them judge sample sizes. This result suggests that the latter group of auditors circumvented 
the decision aid by starting with a desired sample size and "backing into" the decision aid parameters that would produce 
the desired sample size. 

7 A tax preparer always has an incentive to implement a client's preferred reporting position. Thus, a preparer's incentive 
to adopt an aggressive vs. conservative reporting position can be manipulated by varying the wishes of the preparer's 
client. 

8 Auditors weigh the costs and benefits of alternative actions (Knapp 1985; Roberts and Cargile 1994). Thus, like tax 
practitioners, there might be situations where auditors yield to their clients’ wishes. For example, concerns over client 
retention coupled with low engagement risk might align auditors' incentives with their clients' incentives. However, 
auditors’ incentives may be more difficult to manipulate experimentally than tax practitioners’ incentives, because the 
circumstances that compromise professional ethics and auditor independence are difficult to emulate in the laboratory. 
This concern may explain the inability of Jiambalvo and Wilner (1985) to manipulate auditors' incentives by varying 
their client's disclosure preferences. 
TABLE 1 
Characteristics of Subjects (by Experiment) 

                                                                    Statistical
Characteristic                     Experiment 1    Experiment 2    Significance

Average number of months
experience in tax practice             112.2           105.8       F=.50, p=.48
                                       (40.7)¹         (30.6)

Average age                             33.4            33.1       F=.07, p=.79
                                        (4.5)           (3.7)

Percentage of managers with
experience handling defamation
suit settlements                       21.2%           16.1%       χ²=.27, p=.60

Client advocacy²                         2.7             2.9       F=.37, p=.54
                                        (1.5)           (1.6)

Percentage of female subjects          24.2%           43.3%       χ²=2.6, p=.11

1 Standard deviation. 

2 Mean rating to a question that elicited preparers' perception of their relative responsibility to the client versus the 
government when the appropriate reporting position is not indicated unambiguously by evidence (1=totally to the client 
and 11=totally to the government). 


The experiment 1 materials were distributed to 72 tax managers. Thirty-four tax managers 
returned the case materials, yielding a response rate of 47 percent.⁹ Subjects had an average of 
over nine years of tax practice experience. 


Task and Materials 

Subjects were provided with a description of a tax issue, a comprehensive synopsis and copies 
of the relevant court cases, Internal Revenue Code sections, and Treasury Regulations that 
comprise the authority relevant to the issue, a description of a client, and a fictitious practice 
standard. The issue could be dealt with aggressively (i.e., excluding amounts from taxable income) 
or conservatively (i.e., including amounts in taxable income). The subjects' task was to determine 
whether to recommend to the client the aggressive or conservative reporting position. 

The issue concerned the taxability of proceeds obtained from the settlement of a defamation 
of character lawsuit. The tax that would result from inclusion of the proceeds was described as 
material to the client. The issue was selected because the existing authority indicated that both the 
aggressive and conservative reporting positions are potentially supportable.¹⁰ 


9 The responses of one subject were not sensible, and were dropped from all analyses. This participant responded that the 
"reasonable likelihood" standard required a 100 percent likelihood of support for the aggressive position, rated the 
support for the aggressive position as 50 percent, yet recommended the aggressive position, even though the subject had 
an incentive to report conservatively. 

10 See case five of Ayers et al. (1989) for another use of this basic issue to operationalize a case with ambiguous tax 
consequences. 
The synopsis of the relevant authority was drafted in close cooperation with four tax partners. 
The synopsis mimics the style and level of detail of narratives typically found in the workpapers 
of the participating firms. The tax partners believed that the case was realistic, the synopsis was 
a complete and accurate representation of the relevant authority, and that the appropriate tax 
treatment was ambiguous. 

The “realistic possibility” standard in effect currently for tax preparers was not used because 
we were concerned that subjects’ responses would be influenced by guidelines provided by their 
firms, the American Bar Association, and the Internal Revenue Service.¹¹ Instead, the fictitious 
"reasonable likelihood" standard was used because we anticipated that its interpretation would 
be close enough to subjects’ assessments of evidential support to allow an incentive effect on 
interpretation to result in observable differences in reporting decisions. The standard stated that 
penalties would not be imposed if a position had at least a "reasonable likelihood" of being 
supported in court if later litigated. As described in more detail subsequently, subjects did not 
disregard the fictitious standard in favor of the standard currently in place, and subjects indicated 
that their ability to complete the case study was not adversely affected by using a fictitious 
standard. The "reasonable likelihood" standard is shown in appendix A. 


Dependent Variables 

Three dependent measures were elicited from each subject: (1) a reporting decision, (2) an 
assessment of evidential support (hereafter called a "support assessment"), and (3) an interpre- 
tation of the practice standard. The reporting decision was dichotomous, requiring subjects to 
recommend that the client either exclude the proceeds of the defamation settlement from taxable 
income (the aggressive position) or include the proceeds (the conservative position). The support 
assessment was the percentage score that subjects assigned to the likelihood that excluding the 
proceeds of the defamation suit settlement from taxable income would be supported by the courts 
if litigated. The interpretation of the practice standard was the percentage score that subjects 
assigned to the minimum support for an aggressive position that was required by the practice 
standard. The three questions used to elicit the dependent variables are shown in appendix B. 


Independent Variables 

Two independent variables, incentive and order, were crossed in a 2 x 2 between-subjects 
design. 

Tax preparers' incentives to make a conservative or aggressive reporting decision were 
manipulated between subjects by using two different client descriptions. A conservative (aggres- 
sive) incentive was provided by describing the client as a conservative (aggressive), risk-averse 
(risk-taking), knowledgeable taxpayer who believes it makes good business sense to adopt a 
conservative (aggressive, yet legitimate) posture in the resolution of gray areas of the tax law. To 
encourage the preparers to attend to the client's preferences, the client was always described as 
important, with the client's fees comprising a material component of the preparer's practice 
revenues, and the client enhancing the preparer's practice by referring business associates to the 
preparer for tax work. 

All subjects were also assigned to one of two orders of the three dependent variables (all 
dependent variables were elicited after viewing the issue, authority, client description and 
practice standard). Half of the subjects first recommended a tax reporting position, then assessed 
evidential support, and then interpreted the practice standard. The other half first interpreted the 
standard, then assessed evidential support, and then recommended a position. This order 


11 See Bandy et al. (1993) for a description of the evolution of the realistic possibility standard, its application, and extant 
guidelines. 
manipulation allowed us to examine whether the results can be explained by unintended 
dependencies in the measures of the dependent variables. The variable "order" was never 
significant when included as an independent variable in any analysis.¹² Therefore, we collapse 
over the order manipulation when reporting all analyses. 


Procedure 

The experimental materials were mailed directly to the tax manager participants.¹³ Included 
in each packet was a cover letter addressed to the subject stating that the purpose of the study is 
to understand better how tax practitioners recommend reporting positions to their clients when 
the tax law is ambiguous, a letter of endorsement from a contact person in the subject's firm, the 
case materials, and a pre-addressed, stamped return envelope. Subjects completed the task 
whenever they had a sufficient block of time available and returned the completed materials 
directly to us. The mean reported time to complete the case study was 36.6 minutes (standard 
deviation 12.5). 


Results 


Preliminary Analyses and Manipulation Checks 

Data from debriefing questions indicate that the case materials were effective in depicting an 
ambiguous, material tax issue affecting an important client. Subjects viewed the defamation 
settlement issue as somewhat ambiguous: their mean rating of the degree to which the tax law 
indicates clearly the appropriate treatment for the settlement proceeds was 5.3 (standard deviation 
1.9) on an 11 point scale (1=appropriate treatment very unclear to 11=appropriate treatment very 
clear). Subjects viewed the settlement proceeds as material: their mean rating of the materiality 
of the award was 8.1 (standard deviation 1.6) on an 11 point scale (1=very immaterial to 11=very 
material). Subjects viewed the client as important: their mean rating of client importance was 8.6 
(standard deviation 1.9) on an 11 point scale (1=very unimportant to 11=very important). 

Data from a debriefing question indicated that our manipulation of incentives through 
descriptions of the client as either aggressive or conservative was successful. Subjects in the 
aggressive incentive condition viewed their client as more aggressive than did subjects in the 
conservative incentive condition. On an 11 point scale (1=very conservative, 11=very aggressive), 
the mean rating in the conservative condition of 2.6 (standard deviation 1.3) and the mean 
rating in the aggressive condition of 8.2 (standard deviation 2.3) differ significantly (F(1,31)=75.30, 
p<.001, ω²=.692). 

In order to examine how practitioners respond to incentives to report aggressively, the 
incentive manipulation must influence reporting decisions. Table 2 presents the mean, median, 
and standard deviation of subjects' interpretations of the standard, support assessments, and 
recommended reporting decisions for experiment 1. Nineteen percent of subjects assigned to the 
conservative incentive condition chose the aggressive disclosure, while 88 percent of subjects 
assigned to the aggressive incentive condition chose the aggressive disclosure. The decisions are 
analyzed in a univariate logistic regression with incentive as the independent variable. The effect 
of preparer incentive is significant (χ²(1)=12.41, p<.001, hit rate 84.9%), indicating that the tax 


12 Because of the relatively small sample sizes used in both experiments one and two, tests may lack sufficient power to 
detect small order effects. 

13 One firm would not provide the names and addresses of tax managers, but did agree to identify 30 tax manager 
participants, randomly allocate those participants to the two experiments, distribute the case materials we provided, 
collect the completed case materials, and return them to us. 

TABLE 2 

Sample Sizes, Interpretations of Standard, Evidential Support Assessments, 
and Reporting Decisions for Each Condition in Experiment 1 

Preparer               Practice                 Sample 
Incentive              Standard                 Size     Standard¹     Support²      Decision³ 

conservative client    reasonable likelihood    16       56.4          48.4          3 
                                                         (51/12.9)     (50/21.7)     (18.8%) 

aggressive client      reasonable likelihood    17       44.1          53.3          15 
                                                         (41/10.1)     (55/12.4)     (88.2%) 

1 Mean (median/standard deviation) likelihood specified by the practice standard. 

2 Mean (median/standard deviation) likelihood that the aggressive reporting position would be supported by the courts 
if litigated. 

3 Number (percentage) of tax managers who recommended that the client take the aggressive reporting position. 


preparers recommended the aggressive reporting position more to aggressive clients than to 
conservative clients. These results indicate that the incentive manipulation was successful. 

To determine whether subjects disregarded the fictitious “reasonable likelihood” standard in 
favor of the “realistic possibility” standard currently in effect for tax preparers, t-tests are used 
to test the null hypothesis that the participants’ interpretations of the “reasonable likelihood” 
standard are equal to the 1 in 3 interpretation of "realistic possibility” defined by the IRS and the 
ABA. The t-statistics are t(16)=4.5 (p<.001) and t(15)=7.3 (p<.001) in the aggressive and 
conservative conditions, respectively, indicating that subjects did not ignore our fictitious standard 
in favor of the “realistic possibility” standard currently used in practice. Thirty-one of the subjects 
(94 percent) stated in response to a debriefing question that their ability to complete the case study 
was not influenced by being required to use a fictitious practice standard. Results of all analyses 
are the same if only these 31 subjects are included. These results indicate that the fictitious practice 
standard was not disregarded, and that applying a fictitious as opposed to real standard did not 
influence subjects’ ability to complete the case materials. 
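
The comparison with the 1 in 3 benchmark can be sketched as a one-sample t-test computed from summary statistics. This is a minimal illustration, not the authors' analysis: it assumes the rounded means, standard deviations, and sample sizes printed in table 2, and the function name `one_sample_t` is hypothetical. Because the inputs are rounded, the results only approximate the reported statistics.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """Return (t, df) for the null hypothesis that the population mean equals mu0."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# "Realistic possibility" read as 1 in 3, expressed in percent.
REALISTIC_POSSIBILITY = 100.0 / 3.0

# Rounded summary statistics taken from table 2 (interpretation of the standard).
t_aggr, df_aggr = one_sample_t(mean=44.1, sd=10.1, n=17, mu0=REALISTIC_POSSIBILITY)
t_cons, df_cons = one_sample_t(mean=56.4, sd=12.9, n=16, mu0=REALISTIC_POSSIBILITY)

print(f"aggressive condition:   t({df_aggr}) = {t_aggr:.2f}")
print(f"conservative condition: t({df_cons}) = {t_cons:.2f}")
```

Both statistics are far above conventional critical values, which is the sense in which subjects' interpretations differ from the 1 in 3 benchmark.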


The Effect of Incentive on Standard Interpretation and Support Assessment 

H1A would be supported if the interpretations of the standard by subjects who have an 
aggressive incentive are lower (i.e., indicating a more liberal threshold) than the interpretations 
by subjects who have a conservative incentive. As shown in table 2, the mean interpretation of 
the standard in the aggressive and conservative incentive conditions is 44.1 and 56.4, respec- 
tively. When analyzed in a one-way ANOVA with incentive as the independent variable, this 
difference is significant (F(1,31)=9.48, p=.004, ω²=.205). This result supports H1A. Subjects’ 
interpretations of the vague standard were influenced by their incentives. 
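
The effect sizes reported alongside the F statistics behave like omega-squared values (they can be negative when F is below 1). As a hedged sketch, assuming the standard fixed-effects formula and using only the reported F, its numerator degrees of freedom, and the total sample size, the estimate can be recovered as follows. The function name `omega_squared` is hypothetical, and small differences from the reported values are expected because the F statistics are rounded.

```python
def omega_squared(f_stat, df_effect, n_total):
    """Omega-squared for a one-way between-subjects effect.

    Derived from (SS_effect - df_effect * MS_error) / (SS_total + MS_error)
    after dividing numerator and denominator by MS_error.
    """
    numerator = df_effect * (f_stat - 1.0)
    return numerator / (numerator + n_total)

# Experiment 1 has 33 subjects (16 conservative, 17 aggressive).
w2_standard = omega_squared(9.48, df_effect=1, n_total=33)     # interpretation of standard
w2_manipulation = omega_squared(75.30, df_effect=1, n_total=33)  # manipulation check
w2_support = omega_squared(0.63, df_effect=1, n_total=33)      # support assessment

print(round(w2_standard, 3), round(w2_manipulation, 3), round(w2_support, 3))
```

A negative estimate, as for the support assessment, simply means the observed between-group variation is smaller than would be expected from error variance alone.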

Practitioners might also respond to incentives to report aggressively by adopting liberal 
assessments of evidential support. This incentive effect on support assessments would be 
indicated if the support assessments by subjects who have an aggressive incentive are higher 
(providing a more liberal interpretation of the evidence) than the support assessments by subjects 
who have a conservative incentive. As shown in table 2, the mean support assessment is 53.3 and 
48.4 in the aggressive and conservative incentive conditions, respectively. This difference is in 
the direction of an incentive effect. However, when analyzed in a one-way ANOVA with 
incentive as the independent variable, the difference is not significant (F(1,31)=.63, p=.433, 
ω²=-.011). 


The Effect of Standard Interpretation and Support Assessment on Reporting Decisions 

H1B would be supported if subjects who interpreted the standard as providing a low threshold 
are more likely to recommend the aggressive reporting position than are subjects who interpreted 
the standard as providing a high threshold. When analyzed in a univariate logistic regression with 
reporting decision as the dependent variable, interpretation of standard is significant (χ²(1)=4.13, 
p=.042, hit rate 78.7%). This result supports H1B. The lower a subject's interpretation of the 
standard, the more likely that the subject recommended the aggressive reporting position. 

Reporting decisions may also be related to support assessments. When analyzed in a separate 
logistic regression with reporting decision as the dependent variable, support assessment is 
marginally significant (χ²(1)=3.25, p=.072, hit rate 72.7%). The higher a subject’s support 
assessments, the more likely that the subject recommended the aggressive reporting position.¹⁴ 

A low interpretation of the standard (and/or high assessment of evidential support) does not 
necessarily justify a decision with respect to the professional standard. Rather, the interpretation 
must be low enough (and/or the support assessment must be high enough) to comply with the 
simple decision rule specified by the standard. According to this decision rule, practitioners 
whose interpretation of the standard is less (more) than their support assessment should 
recommend the aggressive (conservative) position. We can determine whether our subjects made 
reporting decisions that appear justified with respect to this decision rule by comparing the 
decisions predicted by the decision rule with those actually made. Table 3 reports the results of 
that comparison. Eliminating from the analysis the four subjects whose standard interpretation 
equalled their support assessment, 24 of the remaining 29 subjects’ recommendations (a hit rate 
of 83 percent) are consistent with the decision rule, a proportion similar to that observed in prior 
research in audit judgment and juror decision making. This result is statistically significant 
(χ²(1)=12.52, p<.001), and provides additional support for H1B because it indicates that subjects’ 
reporting decisions are related to their standard interpretations. It also indicates that subjects' 
reporting decisions relate to their standard interpretations and support assessments in a manner 
that justifies their recommendations with respect to the decision rule implied by the professional 
standard. This latter implication is important to determining the more stringent standards used in 
experiment 2. 
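
The decision rule and its hit rate can be sketched in a few lines. This is an illustrative reconstruction, assuming only the rule as stated in the text (recommend the aggressive position when the support assessment exceeds the interpreted threshold, treat ties as missing) and the cell counts printed in table 3. The names `predicted_decision` and `hit_rate` are hypothetical, and the two example subjects are invented for illustration.

```python
def predicted_decision(standard, support):
    """Decision implied by the standard: aggressive only if support exceeds the threshold."""
    if support > standard:
        return "aggressive"
    if support < standard:
        return "conservative"
    return None  # ties are excluded from the comparison

def hit_rate(counts):
    """Share of subjects whose actual decision matches the predicted decision."""
    hits = sum(n for (predicted, actual), n in counts.items() if predicted == actual)
    return hits / sum(counts.values())

# Cell counts from table 3, keyed by (predicted, actual).
table3 = {
    ("conservative", "conservative"): 12,
    ("conservative", "aggressive"): 3,
    ("aggressive", "conservative"): 2,
    ("aggressive", "aggressive"): 12,
}

# Hypothetical subjects: (interpretation of standard, support assessment), in percent.
print(predicted_decision(standard=45, support=55))  # aggressive
print(predicted_decision(standard=60, support=50))  # conservative
print(f"hit rate = {hit_rate(table3):.1%}")
```

With the table 3 counts, 24 of 29 decisions fall on the diagonal, which is the 83 percent hit rate cited in the text.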


14 Similar results are obtained by other logistic regressions. With both interpretation of standard and support assessment 
included in the model, both are significant (for interpretation of standard, χ²(1)=4.12, p=.043; for support assessment, 
χ²(1)=2.95, p=.086) in explaining reporting decisions. With only the difference between interpretation of standard and 
support assessment included in the model, the difference is significant (χ²(1)=6.17, p=.013) in explaining reporting 
decisions. In both cases, the classification hit rate was 78.8%. 

15 This simple decision rule explains well the decisions made by participants in a variety of prior studies. The decision rule 
holds for 84 percent of the auditors' going-concern opinions made with respect to the "substantial doubt" threshold in 
Asare (1992), for 80 percent of auditors’ contingency disclosures made with respect to the "reasonably possible" and 
“probable” thresholds of SFAS No. 5 in Jiambalvo and Wilner (1985), and for 80 percent of jurors’ verdicts made with 
respect to the "reasonable doubt" threshold in Marshall and Wise (1975). None of these studies examined whether 

TABLE 3 
Predicted Versus Actual Reporting Decisions: Experiment 1 

                                   Actual Reporting Decision 
Predicted Reporting Decision¹      Conservative²     Aggressive³ 
Conservative                       12                3 
Aggressive                         2                 12 
Total⁴                             14                15 

1 The predicted reporting decision is conservative (aggressive) if the tax manager’s assessment of the evidential support 
for the aggressive reporting position was less (greater) than his/her interpretation of the evidential support required by 
the professional standard. 

2 Number of tax managers who recommended that the client take the conservative reporting position. 

3 Number of tax managers who recommended that the client take the aggressive reporting position. 

4 Four tax managers are treated as missing values for this analysis because their interpretations of the standard equalled 
their support ratings. 


IV. EXPERIMENT 2: THE EFFECT OF CHANGES IN STANDARDS 
ON REPORTING DECISIONS 


The results of experiment 1 indicate that practitioners use the latitude available in interpreting 
vague standards to justify aggressive reporting positions, satisfying the first condition necessary 
for modifications of professional standards to mitigate the aggressiveness of reporting decisions. 
Experiment 2 investigates the second necessary condition: that, when the effect of incentive on 
standard is precluded, the incentive effect that occurred through the interpretation of the standard 
in experiment 1 does not shift and occur instead through some other component of the reporting 
decision problem, e.g., through assessments of evidential support. 


Method 


Overview 

Subjects participating in experiment 2 performed the same task as that performed by subjects 
in experiment 1. The purpose of experiment 2 was to determine whether a more stringent (i.e., 
higher and more precise) standard mitigates the effect of an aggressive incentive on reporting 
decisions. Therefore, all subjects were provided the aggressive incentive used in experiment 1 and 
one of two standards that were more stringent than that used in experiment 1. Because subjects 
were assigned randomly between experiments, we could compare the responses of subjects given 
these more stringent standards to the responses of subjects given a less stringent standard in 
experiment 1 to assess the effect of standard stringency. 


Subjects 

Recall that a total pool of 138 tax managers from five Big 6 firms was recruited, and that 
subjects were assigned randomly across both experiments and to one of the cells within each 
experiment. The experiment 2 materials were distributed to 66 tax managers. Thirty-one tax 


incentives for aggressive reporting influence interpretations of professional standards. 
16 Because of staff turnover at one firm, four sets of materials could not be distributed. 
managers returned the case materials, yielding a response rate of 47 percent. Descriptive 
information about subjects (determined by responses to debriefing questions) is shown in table 
1. These data indicate that the groups of subjects assigned to experiments 1 and 2 are very similar. 
The two groups do not differ significantly in months of experience, age, experience handling 
defamation suit settlements, inclination to act as a client advocate, and gender.¹⁷ These results 
suggest that the random assignment was effective in producing very similar groups of subjects 
in the two experiments. 


Task and Materials 

Except for the tax practice standard (described subsequently), and the fact that all subjects 
assigned to experiment 2 received the aggressive incentive, the task and materials were the same 
as in experiment 1. 


Dependent Variables 

The three dependent measures elicited from each subject were the same as in experiment 1. 


Independent Variables 

Data from experiment 1 were used to select the two stringent standards used in experiment 
2. One of the stringent standards was the 55 percent standard, which is equal to the median support 
assessment made by experiment 1 subjects in the aggressive incentive condition (see table 2). The 
55 percent standard is shown in appendix A. The 55 percent standard is more stringent in two 
respects than the “reasonable likelihood” standard used in experiment 1. The 55 percent standard 
sets a higher probability threshold than the “reasonable likelihood” standard (55 percent is greater 
than either the mean (44 percent) or median (41 percent) of the “reasonable likelihood” standard 
in the aggressive incentive condition), so it is more difficult for a given set of evidence to justify 
an aggressive reporting decision. The 55 percent standard is also more precise than the 
"reasonable likelihood" standard (55 percent is numerical while "reasonable likelihood" is 
verbal), diminishing the latitude available in interpreting the standard. 

Any relatively high numeric standard would be more stringent than the "reasonable 
likelihood" standard. We chose the 55 percent standard because it also provides a benchmark 
against which to assess the aggressiveness of reporting decisions in experiment 2. Recall that 
experiment 1 demonstrated that subjects tend to make decisions that are consistent with the 
decision rule implied by the professional standard. Also recall that a 55 percent standard equals 
the median support assessment of subjects in the aggressive incentive condition in experiment 1 
(i.e., 55 percent; see table 2). Thus, under a 55 percent standard, only half of subjects in experiment 
2 should recommend the aggressive reporting position, unless incentives affect decisions through 
some other component of the decision problem when the standard is rendered more stringent. 

The other precise standard used in experiment 2 was the 60 percent standard. The 60 percent 
standard (shown in appendix A) was selected by examining a stem and leaf plot of the support 
assessments of experiment 1 subjects in the aggressive incentive condition. Two subjects 
provided support assessments of 55 percent. Four subjects provided support assessments of 60 
percent. This suggests that the median support assessment in experiment 1 may be unstable. That 
is, a shift of two subjects could produce a median of 60 percent. This potential instability of 


17 Although the difference is not statistically significant at conventional levels (p=.11), the groups of participants differ 
somewhat in proportion of male versus female subjects. Prior research has shown that male taxpayers (Vogel 1974; 
Mason and Calvin 1978; Spicer and Becker 1980; Spicer and Hero 1985) and male tax preparers (Cuccia 1994) tend 
to be more aggressive and risk seeking than female taxpayers and tax preparers. The higher proportion of female 
participants in experiment 2 than in experiment 1 would therefore tend to render less aggressive the reporting decisions 
of subjects in experiment 2, and thus introduces a bias against supporting our hypotheses. 
median is a concern because it suggests that a 60 percent median support assessment might be 
obtained in experiment 2 without any increase in the degree to which incentives influence support 
assessments. Therefore, to ensure that results under the 55 percent standard cannot be explained 
by instability of median support assessments across experiments, we also examined decisions 
made with respect to a 60 percent standard.¹⁸ 

Our interest is in comparing performance under each stringent standard to performance under 
the vague "reasonable likelihood" standard, not in comparing performance under the alternative 
stringent standards. The 60 percent standard is simply a more conservative version of the 55 
percent standard that allows us to be confident that results supporting hypotheses in experiment 
2 are not attributable to an unstable median support assessment. 

All subjects were also assigned to the same two orders of the dependent variables used in 
experiment 1. As in experiment 1, order was never significant when included as an independent 
variable in any analysis. Therefore, we collapse over the order manipulation when reporting all 
analyses. 


Procedure 

Experiment 2 was administered two months after experiment 1, because data from experiment 1 
were used to determine the more stringent standards used in experiment 2. The same 
procedure was used in both experiments. As in experiment 1, the mailing of materials to subjects 
(with cover letters addressed to each subject) ensured that the same subjects did not participate 
in both experiments. Also, subjects provided the last six digits of their social security number. No 
two were the same. The mean reported time to complete experiment 2 was 38 minutes (standard 
deviation 13.6). The time to complete the experiment did not differ significantly between 
experiments 1 and 2 (F(1,61)=.17, p=.681, ω²=-.013). 


Results 


Preliminary Analyses and Manipulation Checks 

None of the results of preliminary analyses or manipulation checks differed significantly 
from those obtained in experiment 1. As in experiment 1, subjects viewed the case materials as 
depicting an ambiguous, material tax issue and an important client. Subjects' mean rating of the 
degree to which the tax law indicates clearly the appropriate treatment for the settlement proceeds 
was 5.3 (standard deviation 2.0) on an 11 point scale (1=appropriate treatment very unclear to 
11=appropriate treatment very clear). Subjects’ mean rating of the materiality of the award was 
8.3 (standard deviation 2.0) on an 11 point scale (1=very immaterial to 11=very material). 
Subjects’ mean rating of client importance was 9.3 (standard deviation 1.8) on an 11 point scale 
(1=very unimportant to 11=very important). 

Data from a debriefing question indicate that, like subjects receiving the aggressive incentive 
in experiment 1, subjects in experiment 2 viewed their client as aggressive. On an 11 point scale 
(1=very conservative, 11=very aggressive), the mean rating was 9.0 (standard deviation 1.5). 

In addition to being more precise (by having thresholds communicated numerically), the 55 
percent and 60 percent standards were also intended to communicate a higher probability 
threshold than the "reasonable likelihood" standard used in experiment 1. Subjects in the 
aggressive condition in experiment 1 interpreted the "reasonable likelihood" standard as denoting 


18 The 55 percent median from experiment 1 might also be unstable in the opposite direction, with subjects in experiment 
2 generating a median support assessment of 50 percent. This case is not a source of concern, because support 
assessments of less than 55 percent would bias results away from supporting H2 and H3A. 
a mean threshold of 44.1%, while subjects in experiment 2 recognized that the 55 percent and 60 
percent standards denoted thresholds of 55 percent and 60 percent, respectively (see table 4). 
T-tests are used to examine whether interpretations of the “reasonable likelihood” standard are 
significantly less than 55 percent and 60 percent. The t-statistics are t(16)=4.5 (p<.001) and 
t(16)=6.5 (p<.001), respectively, indicating that the “reasonable likelihood” standard communicated 
a lower (less stringent) threshold than either the 55 percent standard or the 60 percent 
standard. Thirty of the subjects (98 percent) stated that their ability to complete the case study was 
not influenced by being required to use a fictitious practice standard. Results of all analyses are 
the same with only these 30 subjects included in the analyses. 


The Effect of Incentive on Reporting Decision 

Table 4 presents the mean, median, and standard deviation of subjects’ support assessments 
and decisions under the stringent standards, as well as those of subjects who participated in the 
aggressive incentive condition under the vague standard used in experiment 1. As discussed 
previously in the method section, H2 would be supported if greater than 50 percent of subjects 
recommend the aggressive position under the 55 percent and 60 percent standards. Table 4 
indicates that 81 percent and 80 percent of the managers in the 55 percent standard and 60 percent 
standard conditions, respectively, recommended the aggressive disclosure. Binomial tests are 
used to test H2. The null hypothesis is that the likelihood of recommending the aggressive 
disclosure is equal to 50 percent. The p-values are .002 and .004 in the 55 percent standard and 
the 60 percent standard conditions, respectively. This result supports H2: incentives to report 
aggressively influenced reporting decisions, even when the effect of incentive on interpretation 
of standard observed in experiment 1 was precluded. 

A logistic regression across all of the aggressive incentive conditions is used to examine the 
relative aggressiveness of disclosure decisions. The dependent variable is reporting decision, and 
the independent variable is standard stringency (coded as two levels, either low [the “reasonable 
likelihood" standard] or high [the 55 percent and 60 percent standards]). The effect of standard 
stringency is not significant (χ²(1)=.45, p=.504). This result indicates that subjects’ reporting 
decisions were not significantly less aggressive under the 55 percent and 60 percent standards 
than they were under the "reasonable likelihood" standard, even though the 55 percent and 60 
percent standards were higher and more precise. Given an aggressive incentive, subjects made 
aggressive disclosures, regardless of the level and precision of the standard. 
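
When a logistic regression has a single two-level predictor, the fitted slope equals the log odds ratio of the underlying 2 x 2 table, and its large-sample variance is the sum of the reciprocal cell counts. The sketch below uses that closed form with the decision counts in table 4. It assumes, without confirmation from the text, that the reported chi-square is a Wald statistic; the function name `wald_chi2_binary_predictor` is hypothetical.

```python
import math

def wald_chi2_binary_predictor(aggr_1, cons_1, aggr_0, cons_0):
    """Wald chi-square (1 df) for the slope of a logit with one binary predictor.

    aggr_1, cons_1: aggressive / conservative decisions in the group coded 1.
    aggr_0, cons_0: aggressive / conservative decisions in the group coded 0.
    All four cells must be nonzero.
    """
    log_odds_ratio = math.log((aggr_1 / cons_1) / (aggr_0 / cons_0))
    variance = 1 / aggr_1 + 1 / cons_1 + 1 / aggr_0 + 1 / cons_0
    return log_odds_ratio ** 2 / variance

# High stringency pools the 55 and 60 percent standards: 13 + 12 aggressive of 16 + 15 subjects.
# Low stringency is the "reasonable likelihood" standard: 15 aggressive of 17 subjects.
wald_stringency = wald_chi2_binary_predictor(aggr_1=25, cons_1=6, aggr_0=15, cons_0=2)

print(f"Wald chi-square for standard stringency = {wald_stringency:.2f}")
```

The statistic is well below the 3.84 critical value for one degree of freedom at the .05 level, consistent with the conclusion that stringency did not significantly reduce aggressive recommendations.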


The Effect of Incentive on Support Assessments 

H3A would be supported if support assessments under the 55 percent and 60 percent 
standards are higher than those under the "reasonable likelihood" standard. The mean support 
assessments were 53.3, 62.5, and 65.3 in the "reasonable likelihood," 55 percent, and 60 percent 
standard conditions, respectively (see table 4). The support assessments are analyzed in a one- 
way ANOVA across all of the aggressive incentive cells with standard as an independent variable. 
The main effect of standard is significant (F(2,45)=4.19, p=.022, ω²=.117) as are the contrasts 
between the “reasonable likelihood” standard and the 55 percent standard (F(1,45)=4.57, p=.038, 
ω²=.066) and the “reasonable likelihood” standard and the 60 percent standard (F(1,45)=7.47, 
p=.009, ω²=.120). This result supports H3A. Support assessments were higher under the more 
stringent 55 percent and 60 percent standards than under the less stringent "reasonable likelihood" 
standard. 


The Effect of Support Assessments on Reporting Decisions 

H3B would be supported if support assessments are positively related to the likelihood that 
a subject would choose the aggressive reporting position. When the experiment 2 data were 
analyzed in a univariate logistic regression with reporting decision as the dependent variable, 

TABLE 4 

Sample Sizes, Interpretations of Standard, Evidential Support Assessments, 
and Reporting Decisions for Each Aggressive-Incentive Condition 

               Practice                 Sample 
Experiment     Standard                 Size     Standard¹     Support²        Decision³ 

1              reasonable likelihood    17       44.1          53.3            15 
                                                 (41/10.1)     (55/12.4)       (88.2%) 

2              55% standard             16       55.0          62.5            13 
                                                 (55/0)        (62.5/12.2)     (81.2%) 

2              60% standard             15       60.0          65.3            12 
                                                 (60/0)        (70/12.5)       (80.0%) 

1 Mean (median/standard deviation) likelihood specified by the practice standard. 

2 Mean (median/standard deviation) likelihood that the aggressive reporting position would be supported by the courts 
if litigated. 

3 Number (percentage) of tax managers who recommended that the client take the aggressive reporting position. 


support assessment is significant (χ²(1)=5.35, p=.021, hit rate 90.3%).¹⁹ This result supports 
H3B. The higher a subject’s support assessments, the more likely that the subject recommended 
the aggressive reporting position. 

As in experiment 1, to determine whether the subjects participating in experiment 2 made 
reporting decisions that appear justified with respect to the professional standard they were given, 
we compared the reporting decisions predicted by the decision rule stipulated by the standard with 
the reporting decisions that subjects actually made. Table 5 reports the results of that comparison. 
Eliminating from the analysis the two subjects whose standard interpretations equalled their 
support assessments, 27 of the remaining 29 subjects’ recommendations (a hit rate of 93 percent) 
are consistent with the decision rule. This result is statistically significant (χ²(1)=19.859, p<.001), 
and is consistent with the results obtained in experiment 1. This result provides additional support 
for H3B because it indicates that subjects’ reporting decisions are related to their support 
assessments. It also indicates that, as in experiment 1, subjects’ reporting decisions are related to 
their standard interpretations and support assessments in a manner that justifies their recommen- 
dations with respect to the decision rule implied by the professional standard. 
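
The association between predicted and actual decisions can be checked with a Pearson chi-square for a 2 x 2 table. The sketch below assumes the uncorrected Pearson statistic (the text does not name the test) and uses the cell counts printed in table 5; the function name `pearson_chi2_2x2` is hypothetical.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square (1 df) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)

# Table 5: rows are predicted decisions, columns are actual decisions.
#                   actual conservative   actual aggressive
# predicted cons.            6                    2
# predicted aggr.            0                   21
chi2_table5 = pearson_chi2_2x2(6, 2, 0, 21)
hit_rate_table5 = (6 + 21) / (6 + 2 + 0 + 21)

print(f"chi-square = {chi2_table5:.3f}, hit rate = {hit_rate_table5:.1%}")
```

Note that one cell of table 5 is zero and another is small, so the chi-square approximation is rough here; an exact test would be a more cautious choice for a table of this size.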


19 Because the interpretation of a precise threshold is a constant, logits that contain both interpretation of standard and 
support assessment as the independent variables, or that contain the difference between interpretation of standard and 
support assessment as the independent variable, all yield the same result as the univariate logit containing only support 
assessment as the independent variable. 
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TABLE 5 
Predicted Versus Actual Reporting Decisions: Experiment 2 


Actual Reporting Decision 
Predicted Reporting Decision! Conservative Aggressive? 
Conservative 6 2 
Aggressive 0 21 
Total = e n Bo 


1 The predicted reporting decision is conservative (aggressive) if the tax manager’s assessment of the evidential support
for the aggressive reporting position was less (greater) than his/her interpretation of the evidential support required by
the professional standard.

2 Number of tax managers who recommended that the client take the conservative reporting position.

3 Number of tax managers who recommended that the client take the aggressive reporting position.

4 Two tax managers are treated as missing values for this analysis because their interpretations of the standard equalled
their support ratings.
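
The hit rate and test statistic reported for Table 5 can be checked from the four cell counts. The following is a minimal sketch only: it assumes the reported statistic is an uncorrected Pearson chi-square for a 2 x 2 table (the text does not name the test), and the function name is hypothetical.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square (1 df) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Table 5 cell counts: rows are predicted decisions, columns are actual decisions.
pred_cons_actual_cons, pred_cons_actual_aggr = 6, 2
pred_aggr_actual_cons, pred_aggr_actual_aggr = 0, 21

n_subjects = (pred_cons_actual_cons + pred_cons_actual_aggr
              + pred_aggr_actual_cons + pred_aggr_actual_aggr)

# A "hit" is a subject whose actual decision matches the predicted decision.
hits = pred_cons_actual_cons + pred_aggr_actual_aggr
hit_rate = hits / n_subjects

chi_square = pearson_chi_square_2x2(pred_cons_actual_cons, pred_cons_actual_aggr,
                                    pred_aggr_actual_cons, pred_aggr_actual_aggr)

print(f"hits = {hits} of {n_subjects}, hit rate = {hit_rate:.2f}")
print(f"chi-square(1) = {chi_square:.3f}")
```

Under that assumption the counts give 27 hits among 29 subjects and a statistic of about 19.859, in line with the values reported in the text.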





V. DISCUSSION 


As mentioned previously, there are two conditions necessary for more stringent professional 
standards to be effective in reducing the aggressiveness of reporting decisions. The first condition 
is that the influence on reporting decisions of an incentive to report aggressively must occur (at 
least in part) by influencing the interpretation of a vague standard. As hypothesized, the results 
of experiment 1 suggest that this condition is satisfied. The second necessary condition is that 
when a more stringent standard is in place, practitioners must not compensate for losing the 
latitude available in interpreting a vague standard by assessing more liberally the evidential 
support for their preferred position. As hypothesized, the results of experiment 2 suggest that this 
condition is not satisfied. When an effect of incentives on interpretation of standard was precluded 
by the stringent standards used in experiment 2, tax preparers adopted more liberal interpretations 
of evidential support. Thus, by shifting the influence of incentives from their interpretation of 
standards to their assessments of evidential support, preparers made reporting decisions under 
more stringent standards that were not significantly less aggressive than reporting decisions made 
under a less stringent standard. 

These results are consistent with a growing literature which examines the effects of
justification and accountability requirements on judgments and decisions in accounting settings.20
This literature indicates that people often engage in justification or other “defensive
bolstering” behavior when accountable to others for their actions (Gibbins and Emby 1984; 
Tetlock 1985), and that the importance of justification increases when an action is counter to the 
preferences of whomever holds a decision maker accountable (Messier and Quilliam 1992; 


20 See Beach and Mitchell (1978), Tetlock (1985), and Tetlock et al. (1989) for discussions of the effects of justification
requirements on judgments and decisions in general settings, and Ashton (1990, 1992), Gibbins and Newton (1993),
Kennedy (1993), Messier and Quilliam (1992), and Quilliam (1993) for discussions of the effects of justification
requirements on judgments and decisions in accounting contexts.
Quilliam 1993). Our subjects consistently justified their decisions with respect to the practice
standards against which the appropriateness of their disclosure decisions would be judged, with 
the particular justification tactic employed depending on the vagueness of the criterion against 
which subjects’ decisions would be held accountable. 

This ability to shift justification tactics may decrease the effectiveness of attempts to mitigate 
the effect of incentives on reporting decisions through modifications of the level and precision 
of professional standards. Obviously, very radical changes could have an effect. For example, a 
“99 percent standard” would probably preclude most aggressive recommendations in our tax 
setting. However, the changes in professional standards enacted recently are not of this nature.
As mentioned previously, the ABA and IRS have recently defined the “realistic possibility” 
standard as requiring a one in three chance of success. Likewise, financial accounting standards 
have shifted from “probable” to “more likely than not,” and quantified “more likely than not” as
“greater than 50 percent.” Subject to the caveats discussed below, our results suggest that such 
relatively minor modifications of practice standards may not mitigate aggressive reporting. 

Our experiments were set in a tax context. Although the basic decision process examined in 
this study may be the same in tax, financial reporting, and audit settings, there are fundamental 
differences in professional responsibilities and penalty structures across these settings. So, care 
should be exercised when generalizing our results to the financial reporting/auditing arena. Future 
research could identify the circumstances in financial reporting and auditing where incentives for 
aggressive reporting are sufficient to obtain the effects observed in this study. 

Care must also be exercised when generalizing the results reported in our paper to other tax 
contexts which have very different levels of uncertainty associated with the appropriate reporting 
position. For example, in some tax contexts, sufficient legal precedent exists to reduce greatly the 
latitude available in interpreting evidential support. Prior research indicates that the amount of 
uncertainty present in the tax context influences tax preparers’ aggressiveness (Alm 1991; Beck 
and Jung 1989; Scotchmer 1989). The less uncertainty that exists in interpreting evidential 
support, the more likely that increasing standard stringency will reduce the frequency of 
aggressive reporting. Future research could test this assertion. 


APPENDIX A 
Practice Standards Used in the Experiment 


Reasonable Likelihood Standard: 


Assume that the practice standard relevant to recommending reporting positions to clients in 
effect at the time you must recommend a tax reporting position is the “reasonable likelihood” 
standard (a fictitious practice standard that we created for this case study). The “reasonable 
likelihood” standard is defined as follows: 


Preparer penalties will be imposed if any part of any understatement of liability with respect 
to any return or claim for refund is due to a position for which there was not a reasonable likelihood 
of being sustained on its own merits. The “reasonable likelihood” standard is less stringent than 


1 The wording of the “reasonable likelihood” standard mimics the phrasing of the “substantial authority” standard in
effect currently for taxpayers.
a “reasonably probable” standard, but stricter than a “reasonably doubtful” standard. Thus, a
position with respect to the tax treatment of an item that is arguable but fairly unlikely to prevail 
in court would satisfy a “reasonably doubtful” standard, but not the “reasonable likelihood” 
standard. 


55% Standard: 


Assume that the practice standard relevant to recommending reporting positions to clients in 
effect at the time you must recommend a tax reporting position is the “55 percent” standard (a
fictitious practice standard that we created for this case study). The 55 percent standard is defined 
as follows: 


Preparer penalties will be imposed if any part of any understatement of liability with respect 
to any return or claim for refund is due to a position for which there was not greater than a 55 
percent chance of being sustained on its own merits. 


60% Standard: 


Assume that the practice standard relevant to recommending reporting positions to clients in 
effect at the time you must recommend a tax reporting position is the “60 percent” standard (a
fictitious practice standard that we created for this case study). The 60 percent standard is defined 
as follows: 


Preparer penalties will be imposed if any part of any understatement of liability with respect 
to any return or claim for refund is due to a position for which there was not greater than a 60 
percent chance of being sustained on its own merits. 
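
The numeric standards above imply a simple decision rule: an aggressive position escapes preparer penalties only if its assessed chance of being sustained exceeds the stated threshold. The sketch below illustrates that rule under stated assumptions. The function name and the sample assessments are hypothetical. Ties are left unclassified, mirroring the two subjects excluded from Table 5; under a strict reading of "greater than", a tie would instead fall on the conservative side.

```python
def predicted_decision(support_pct, threshold_pct):
    """Return the reporting decision implied by a numeric practice standard.

    support_pct:   assessed probability (0-100) that the aggressive position
                   would be sustained on its own merits.
    threshold_pct: probability required by the standard (e.g. 55 or 60).
    """
    if support_pct > threshold_pct:
        return "aggressive"    # assessed support exceeds the standard
    if support_pct < threshold_pct:
        return "conservative"  # assessed support falls short of the standard
    return None                # tie: left unclassified in this sketch


# Hypothetical (support, threshold) pairs, not data from the experiments.
examples = [(70, 55), (40, 60), (60, 60)]
for support, threshold in examples:
    print(support, threshold, predicted_decision(support, threshold))
```

Comparing the output of such a rule with each subject's actual recommendation is the kind of tabulation that Table 5 reports.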


APPENDIX B 
Questions Used To Elicit Dependent Variables 


Interpretation of Standard:1


In the case materials, you were told that, under the applicable practice standard, preparer 
penalties will not be imposed if a position has a reasonable likelihood of being sustained on its 
own merits. Please quantify the threshold conveyed by the “reasonable likelihood” standard as
it applies to this case. To have a “reasonable likelihood” of being sustained, the recommendation
needs a probability of being sustained if litigated of at least


______ %


1 The wording of the standard interpretation question was changed to be consistent with the standard a subject was given. 
For example, subjects who received the 55 percent standard read that “... penalties will not be imposed if a position has 
a 55 percent chance of being sustained on its own merits.”
Assessed Evidential Support:


Suppose that the entire amount of the settlement is excluded from income on the return and 
is later challenged by the IRS. Based solely on the legislative, administrative and judicial 
authority provided, what is the probability that the exclusion of the settlement would be supported 
by the courts if litigated? 

______ %


Reporting Decision: 


If a tax-reporting position must meet the “reasonable likelihood” standard before it can be 
recommended to a client, and no other legislative, administrative, or judicial authority exists that 
is relevant to your decision other than that provided, would you recommend that your client treat 
the settlement as excludable under Sec. 104(a)(2)? Assume that excluding the settlement from 
income with disclosure is not an option. (check one) 





a. Yes - recommend excluding the settlement from income. 
b. No - recommend including the settlement in income. 
ABSTRACT: This research uses an experimental methodology to examine the
“curse of knowledge” in judgment and the extent to which it is mitigated by
accountability, experience, and counterexplanation. The curse of knowledge occurs
when individuals are unable to (appropriately) disregard information already processed.
Important audit implications of the curse of knowledge arise in going concern
evaluation and analytical review. Experiments 1 and 2 examine these two contexts
with both auditors and MBA students. Results show significant curse of knowledge
effects among both auditors and MBA students. These effects are not mitigated by
accountability, consistent with Kennedy’s (1993) debiasing framework. A third
experiment finds that counterexplanation (explaining why a particular outcome might
not occur) does eliminate the curse of knowledge.
I. INTRODUCTION


This paper investigates the “curse of knowledge” in judgment and the extent to which it
is mitigated by accountability, experience, and explicit counterexplanation. The curse of
knowledge occurs when, in predicting others’ knowledge or forecasts, individuals are
unable to ignore knowledge they have that others do not have (Camerer et al. 1989) or when they 
are unable to disregard information already processed (Fischhoff 1977). The curse of knowledge 
has been demonstrated in a number of audit contexts. For example, Biggs and Wild (1985), Heintz 
and White (1989), Kinney and Uecker (1982), and McDaniel and Kinney (1994) show that 
auditors’ preliminary analytical review judgments are prone to this bias. Buchman (1985) finds
this bias in bankruptcy predictions and Reimers and Butler (1992) show it in auditors’ internal 
control judgments and opinion qualification decisions. 

Implications of the curse of knowledge arise in at least two audit contexts. First, auditors who 
are aware of their client’s unaudited book values may unknowingly direct their expectations 
toward those book values while performing preliminary analytical review and thus fail to detect 
significant “unexpected” changes in accounts, which is the objective of preliminary analytical review.1
Implications for audit efficiency and effectiveness may be substantial because analytical review 
is used: (1) in the planning stage of audits to set audit objectives and design tests; (2) to obtain 
evidence about the reasonableness of account balances; and (3) at the audit’s conclusion as an 
overall evaluation of presentation fairness and the reasonableness of the financial statements as 
a whole (Arens & Loebbecke 1988). Second, with respect to going concern evaluation, the 
perceived culpability of auditors who do not modify reports of clients that subsequently fail may 
be greater in the eyes of peers, shareholders, the SEC, expert witnesses, and jurors, all of whom 
have knowledge (that the client did indeed fail) that the auditor did not have when the report 
decision was made (Lowe and Reckers 1994). 

This paper is concerned with debiasing the curse of knowledge. Kennedy (1993) proposed 
a general debiasing framework that focused on the source of bias. According to her framework, 
biases are predominantly effort related or (internal or external) data related. These classifications, 
described in section II, determine the debiasing prescription. Prior research related to the curse 
of knowledge, discussed in section III, suggests that this bias is likely more data related than effort
related. 

Experiments test three debiasing mechanisms: accountability, experience, and 
counterexplanation. Accountability and experience are examined in the first two experiments. It 
is important to test the debiasing potential of experience and accountability first because both are 
already part of the audit environment. Audit firms institutionalize accountability through the 
review process (Kennedy 1993; Tan 1994). Accountability is an effort-inducing incentive and 
should mitigate effort-related biases. Familiarity with the task through experience may also 
mitigate biases, particularly if they are due to lack of knowledge or experience in evaluating 
evidence. Smith and Kida (1991) reviewed heuristics and biases research in the audit judgment 
literature and found that the extent of bias is often less when auditors perform job-related tasks. 
If these natural mechanisms are ineffective, then investigation of other debiasing mechanisms is 
warranted. The third experiment examines one such mechanism, counterexplanation, which
involves explicitly considering evidence that would not support or lead one to expect the outcome 
that occurred (Butler 1985; Heiman 1990; Koonce 1992). Counterexplanation should mitigate 
curse of knowledge effects because it addresses the cognitive nature of this bias. Specifically, it 
weakens causal connections between evidence and the actual outcome (Lipe 1991). Alternative 
outcomes are made more salient with counterexplanation. 

Curse of knowledge effects are found among both auditors and MBA students in experiments 
using both going concern and analytical review type tasks. Accountability is ineffective in 
mitigating these effects. Further tests reveal that successful debiasing can be accomplished 
through counterexplanation, a concept consistent with auditor skepticism. The experimental 
results are consistent with the debiasing framework presented in section II and in Kennedy (1993). 


1 It is inappropriate to include the unaudited book value in an expectations model for that outcome for the purposes of
identifying unexpected fluctuations. However, once the expectation is formed unaudited book values are used to identify 
unexpected fluctuations. For a formal analysis of the consequences associated with the use of book value in preliminary 
analytical review, see Wild and Biggs (1990). 
The contributions of this research are both theoretical and practical. First, it extends previous
research by demonstrating the curse of knowledge across two subject groups and two audit- 
related tasks for which there are nontrivial implications of the curse of knowledge. This research 
also shows that, although often positive in their effects on judgment, accountability and 
experience do not automatically improve judgment. Rather, their effectiveness depends on the 
source of judgment error. Other, more effective mechanisms, such as counterexplanation, 
become apparent when the source of judgment error is examined. 

The remainder of this paper is organized as follows: Section II provides a debiasing 
framework that considers the source of judgment bias and possible remedies. Section III provides 
a formal definition of the curse of knowledge and reviews relevant psychological literature. 
Experiments are provided in sections IV, V, and VI. Section VII summarizes and concludes. 


II. A DEBIASING FRAMEWORK


The following debiasing framework focuses on sources of judgment bias (see figure 1). In 
the spirit of traditional audit judgment research, its emphasis is on improving judgment. Judgment 
quality is a function of effort and data (see also Kennedy 1993). Effort has two components: 
capacity and motivation. I assume that capacity is sufficient or can be augmented if required and 
focus only on motivation and data. 

Judgment quality may be compromised when an individual is not motivated to supply the 
requisite attention and cognitive effort. The obvious means to improve judgment quality in this 
case is to provide the judge with incentives so that the benefits associated with additional attention 
and effort outweigh the costs. Such incentives include accountability and monetary incentives. 
The relationships between effort, incentives, and performance are not new to audit judgment 
research. Ashton (1990) proposed a framework in which incentives increase pressure which in 
turn increases attention and effort. Performance improves until a threshold is reached after which 
additional pressure is counterproductive. Libby and Lipe (1992) also stress the contingent nature 
of incentives. They maintain, consistent with the framework here, that performance improves 
only if effort increases and the cognitive processes involved are sensitive to increased effort. 

Judgment quality may also be compromised when data (internal or external) are poor. 
Internal data refers to the judge’s knowledge stored in memory. External data refers to 
information or signals from the environment. Possible remedies for poor internal data include, but 
are not limited to, refreshing the judge’s memory, retraining, providing decision aids, and 
replacing the judge with one more knowledgeable. External data is poor when it is incomplete, 
obscured by irrelevant data, or presented in a form that undermines its usefulness. Possible 
remedies include, but are not limited to, searching for more data, elaborating, clarifying and 
restating existing data, or eliminating irrelevant data.2 The major distinction between internal and
external data-related biases is simply the origin of the troublesome data (memory versus 
environment). Obviously, once external data is “ingested” it becomes internal, at least tempo- 
rarily, and biased processing may occur in the same way that it would had the data originated in 
memory. This distinction is potentially useful for debiasing prescriptions because some remedies 
possible with external data are impossible with internal data, e.g., removing or preventing access 
to the offending data. 


2 This framework is similar to, but was developed independent of, a debiasing framework by Arkes (1991). He classified
errors as strategy-based, association-based, and psychophysical-based. Strategy-based errors are similar in spirit to 
effort-related biases, while the other two classifications are similar in spirit to data-related biases (internal or external). 
FIGURE 1
Improving Judgment Quality

This paper, as stated previously, examines alternatives for debiasing the curse of knowledge.
The success of these alternatives depends on the root of the problem, i.e., whether the curse of 
knowledge is a matter of effort, internal data or external data. If the curse of knowledge is not effort 
related, it is external data-related because outcome knowledge is provided rather than recalled. 
The obvious remedy then is not to provide the data in the first place (McDaniel and Kinney 1994).3
However, this remedy may not be available in the audit contexts of concern here. For
example, outcome knowledge regarding a failed firm is what instigates litigation against auditors 
who did not modify the audit report for going concern. Potential remedies do include finding 
individuals who can ignore outcome knowledge, or intervening in the process to discount or 
counteract outcome knowledge. The nature of the curse of knowledge and its debiasing potential 
are described in the following section. 


III. THE CURSE OF KNOWLEDGE 


The curse of knowledge can be expressed formally as a violation of the normative law of
iterated expectations (Camerer et al. 1989; Chow and Teicher 1978). Consider a random variable
$X$. Forecasts of $X$ depend on information sets available to the forecaster. Assume there are two
information sets, $S_0$ and $S_1$, where $S_0$ is a subset of $S_1$. A forecaster with information set $S_1$ knows
everything that a forecaster with information set $S_0$ knows, plus more. A forecast of $X$, given the
information set $S_0$, is $E(X \mid S_0)$. A forecaster with information set $S_1$ who wants to predict the
forecast based on $S_0$ estimates $E[E(X \mid S_0) \mid S_1]$. Formally, the curse of knowledge means that
$E[E(X \mid S_0) \mid S_1]$ is not equal to $E(X \mid S_0)$, contrary to the law of iterated expectations. The forecaster
with information set $S_1$ overestimates the scope of $S_0$ so that $E[E(X \mid S_0) \mid S_1]$ is somewhere between
$E(X \mid S_0)$ and $E(X \mid S_1)$. A simple model that Camerer et al. (1989) used to test the curse of knowledge
is:


$$E[E(X \mid S_0) \mid S_1] = w\,E(X \mid S_1) + (1-w)\,E(X \mid S_0) \qquad (1)$$


If w = 0, the law of iterated expectations is supported; if w = 1, the curse of knowledge is 
supported. Therefore, the parameter w measures the degree of the curse of knowledge. 
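
Equation (1) can be solved for w once three quantities are known: the better-informed forecaster's prediction of the less-informed forecast, the better-informed forecast itself, and the less-informed forecast. The sketch below is illustrative only. The function name, argument names, and numeric values are hypothetical, and the labels S0 (smaller information set) and S1 (larger information set) are used purely for exposition.

```python
def curse_weight(predicted_forecast, informed_forecast, uninformed_forecast):
    """Solve equation (1) for w.

    predicted_forecast:  informed forecaster's estimate of the uninformed
                         forecast, E[E(X|S0)|S1]
    informed_forecast:   forecast given the larger information set, E(X|S1)
    uninformed_forecast: forecast given the smaller information set, E(X|S0)
    """
    gap = informed_forecast - uninformed_forecast
    if gap == 0:
        # When both forecasts coincide, every value of w satisfies equation (1).
        raise ValueError("w is not identified when the two forecasts coincide")
    return (predicted_forecast - uninformed_forecast) / gap


# Hypothetical example: uninformed judges forecast 40, informed judges forecast 80,
# and informed judges guess that the uninformed judges would say 50.
w = curse_weight(predicted_forecast=50, informed_forecast=80, uninformed_forecast=40)
print(f"w = {w:.2f}")
```

In this hypothetical case w is 0.25: the informed judges' prediction sits a quarter of the way from the uninformed forecast toward their own. A prediction equal to the uninformed forecast would give w = 0, and one equal to the informed forecast would give w = 1.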

The curse of knowledge is closely related to hindsight bias, which holds that people 
consistently exaggerate what could have been anticipated in foresight (Fischhoff 1975). People 
even misremember their own original predictions, so as to exaggerate in hindsight what they 
actually knew in foresight (Fischhoff and Beyth 1975; Hell et al. 1988). The hindsight bias is a 
special, temporal, case of the curse of knowledge. Another variation of the same underlying 
phenomenon, the “knew it all along” effect, occurs when people who have been informed of 
correct answers to questions are asked to respond as they would have responded had they not been 
told the answers (Fischhoff 1977). Individuals afflicted by the “knew it all along” effect 
overestimate how well they would have performed by assigning higher probabilities to answers 
reported to be correct. 

A review of cognitive explanations for the hindsight bias and the “knew it all along” effect 
(and therefore the more general curse of knowledge) suggests that this bias is more data-related 
than effort-related.4 For instance, Fischhoff (1977) argues that upon hearing or reading an 


3 Note this remedy is not available for internal data-related biases because spontaneous recall cannot be prevented.

4 Motivational explanations for the hindsight bias (involving individuals’ ego-involvement or self-presentational
motives) have also been proposed (Leary 1981; Campbell and Tesser 1983), but the empirical support is weak and these
explanations seem incomplete at best. Experiments that have varied instructions in order to eliminate these motives still
find significant bias (e.g., Fischhoff 1975; Wood 1978). Connolly and Bukszar (1990) specifically test for a self-presentation
motive versus a cognitive error explanation for hindsight bias. Their results support a cognitive account
for hindsight bias.
outcome, individuals integrate this information with what else they know about that topic in an
attempt to create a coherent whole out of all relevant knowledge. Hawkins and Hastie (1990) 
propose that when individuals make judgments, they review evidence from the environment or 
from long-term memory. Selective mechanisms, not under conscious control, operate such that, 
once an outcome is known, outcome-congruent evidence is more accessible than outcome- 
incongruent evidence. Individuals evaluate the selected evidence by creating a mental model of 
the causal relations among the evidence items. Outcome knowledge provides an opportunity for 
learning and induces the individual to develop and alter generic models of causal relations. Thus, 
hindsight bias is the "dark side" of successful learning and judgment since the mechanisms 
deemed responsible for the bias are the same mechanisms for adaptive learning and proficient 
judgment in a natural environment (Hawkins and Hastie 1990). 

Attempts to debias these effects by entreating subjects to work hard and warning subjects 
about the bias have been largely ineffective (Fischhoff 1982; Wood 1978). However, debiasing 
is not impossible. Hasher et al. (1981) successfully eliminate hindsight bias by telling subjects 
that the "correct" information already incorporated into memory is wrong. Hindsight bias has also 
been reduced with techniques that increase attention to explanations of the outcome(s) that did 
not occur, thus increasing the availability of the nonreported outcome(s) (Davies 1987; Hell et 
al. 1988; Slovic and Fischhoff 1977). Similarly, Brown and Solomon (1987) and Reimers and 
Butler (1992) attenuate such effects by instructing subjects to consider worst possible outcomes 
before they receive outcome knowledge. Finally, Camerer et al. (1989) find that individuals 
cannot ignore private information even when monetary incentives and feedback are provided. 
However, they do provide evidence that a market setting reduces (but does not eliminate) this bias. 
They attribute the market's effectiveness to more frequent trading by less biased traders who 
somehow realize that they are less biased. 

Although somewhat effective, these debiasing mechanisms are not natural to most judgment 
settings. Two potential debiasing mechanisms that are natural to the audit judgment setting are 
accountability and experience. Accountability, defined as the requirement to justify one's 
judgments when called upon, encourages people to exert additional cognitive effort (Tetlock
1985a; McAllister et al. 1979). To the extent that biases are due to insufficient effort, accountability
should mitigate bias (Kennedy 1993). In a series of experiments, Tetlock found accountability
reduced primacy in an impression-formation task (Tetlock 1983), reduced over-attribution in an 
essay-attribution task (Tetlock 1985b), and reduced overconfidence in a personality-prediction 
task (Tetlock and Kim 1987). Kennedy (1993) found accountability eliminated recency in a 
going-concern evaluation task. She also predicted, but did not test, that accountability would be 
ineffective in mitigating data-related biases. Based on cognitive explanations for the curse of 
knowledge discussed earlier in this section (Hawkins and Hastie 1990; Fischhoff 1977), and the 
debiasing attempts of others (particularly Camerer et al. 1989), the curse of knowledge seems 
more data related, a priori, and therefore unlikely to be mitigated by accountability. That is, 
however, an empirical question which this paper hopes to address. 

Although generally positive in its effects, accountability can also exacerbate bias. Tetlock 
and Boettger (1989) showed that accountable individuals provide more regressive judgments 
than nonaccountable subjects when they are given irrelevant information in addition to relevant 
information. Accountable subjects overinterpreted irrelevant information in an effort to “leave no
stone unturned.” The same effect may prevail with the curse of knowledge. Additional cognitive
effort, if motivated, may be devoted to rationalizing or making sense of the reported outcome and 
strengthening associative links with reasons supporting the reported outcome, rather than to 
ignoring the outcome knowledge. 


Kennedy—Debiasing the Curse of Knowledge in Audit Judgment 255 


Smith and Kida (1991) reviewed heuristics and biases research in the audit judgment 
literature and concluded that the extent of judgment bias is often less when experienced auditors 
perform job-related tasks. Several explanations for this exist. First, experience influences both the 
amount of knowledge and the way that knowledge is organized, either of which can mitigate bias 
or simply change the nature of the bias.5 Knowledge can make a task more familiar and therefore
less complex which may counteract effort-related biases. For example, based on Hogarth and 
Einhorn’s (1992) belief revision model, Kennedy (1993) hypothesized that in a going concern 
evaluation task auditors would use an end-of-sequence processing mode while MBA subjects 
would use a less effortful step-by-step processing mode. The latter processing mode results in 
recency while the former does not, according to the model. Consistent with the model, she found 
recency among MBA students but not auditors. However, when MBA students were made 
accountable for their judgments they did not exhibit recency, a result consistent with employing 
the more effortful end-of-sequence processing. Two points that weaken Kennedy’s (1993) 
conclusions are that auditors may simply exert more effort than MBA students, independent of 
knowledge, and that data-related remedies were not tested against recency. 

Second, many of the studies reviewed by Smith and Kida (1991) provide evidence that 
auditors attend preferentially to negative information or outcomes. Smith and Kida suggest that 
auditors adopt “specialized” heuristics in response to the substantial risks associated with many 
audit judgments. Indeed, standards of field work require auditors to maintain an attitude of 
professional skepticism (SAS 53, AICPA 1989). Possibly, the skepticism adopted by auditors 
through training and experience protects them from the curse of knowledge, i.e., auditors are 
trained to think about why the reported outcome would not be expected. However, prior auditing 
research argues against auditors’ spontaneous counterexplanation. Both Heiman (1990) and 
Koonce (1992) found that auditors’ beliefs regarding hypothesized causes for unexpected 
fluctuations in analytical review tasks decrease when auditors are given other explanations to 
consider, are required to generate other explanations, or are asked to counterexplain (consider 
why a given hypothesis might not be correct). Therefore, as with accountability, the effect of 
experience on the curse of knowledge is an empirical question which this paper hopes to address. 

The following two experiments investigate whether accountability or experience can 
mitigate the curse of knowledge in two separate and important audit contexts. Expectations are 
that neither will be effective due to the cognitive nature of this bias. A third experiment 
investigates counterexplanation which should eliminate curse of knowledge effects since it does 
deal with the cognitive aspects of this bias. Specifically, it breaks the strong causal connections 
between evidence and outcome by prompting judges to consider why that particular outcome was 
unlikely (Lipe 1991). 


IV. EXPERIMENT 1: GOING CONCERN EVALUATION 


Method 


Subjects and Procedure 
There were two subject groups. The first was 147 MBA students at Duke University who had, 
on average, 5.8 years of business-related experience. The second was 161 auditors attending 


5 Frederick and Libby (1986) provide an example of how both students and auditors exhibit the conjunction fallacy in
different but predictable ways due to their differential knowledge and the way that knowledge is organized. Auditors
associated errors with internal control weakness while students simply associated error accounts in judging conjunctive
probabilities.

6 MBA subjects consisted of 102 executive and 45 daytime program students. The results are substantially the same when
only executive subjects are included although the power of the tests is lower.
managers’ training sessions for a Big 6 firm. On average, these subjects had been directly involved 
in four audits in which, as part of an audit team, they judged a client’s ability to continue as a going 
concern. Subjects in both groups were randomly assigned to treatment conditions. Experience did 
not differ across conditions for either group of subjects. 


Task 

The going concern evaluation task used in Kennedy (1993) was slightly modified for this 
experiment. The task required subjects to estimate what they thought other subjects, who did not 
know the actual outcome, would estimate as the likelihood that a hypothetical firm would fail 
(enter bankruptcy proceedings) based on eight pieces of evidence.⁷ The eight pieces of firm-specific 
evidence are a subset of those used in Kida (1984) and the same as those used in Kennedy 
(1993). The positive evidence was presented first, followed by the negative evidence.⁸ The 
estimate of what others, who did not know the outcome, would estimate is the dependent variable. 
The actual outcome is clearly irrelevant to this task. Requiring subjects to put themselves in the 
position of others, who did not know the actual outcome, is important to remove a self-flattering 
incentive to simply estimate the outcome they know to be true. With the task structured this way, 
subjects have no reason to think that using the outcome will make them appear smarter or perform 
better since other subjects, whose responses they are trying to predict, did not have access to this 
information (Fischhoff 1975). 


Independent Variables 

A 3 x 3 x 2 (accountability x outcome knowledge x subject group) between-subjects 
experimental design was used. The three accountability conditions were pre-, post- and no- 
accountability. Pre-accountable subjects were informed that they were accountable before they 
read the evidence while post-accountable subjects were informed that they were accountable after 
they read the evidence but before they made their estimate. The post-accountability conditions 
are necessary to rule out an alternative hypothesis that accountability induces conservatism rather 
than cognitive effort. For instance, accountable subjects might not show curse of knowledge 
effects because accountability makes them conservative and unwilling to revise their priors for 
fear of being wrong. If so, it should make no difference whether subjects are told they are 
accountable before or after they see the evidence. But, if accountability motivates cognitive effort 
and the curse of knowledge is effort related, pre-accountable subjects should show less bias than 
post-accountable subjects. Accountable individuals were informed that their responses would be 
reviewed and that they may be selected for a follow-up interview in which they would be asked 
to explain and justify their responses. They were asked to provide their name and phone number 
so they could be contacted. The three levels of outcome knowledge were failed, continued, and 
no outcome knowledge. Subjects in the failed outcome knowledge condition were told that the 
firm, selected at random from a sample of 100 firms, did, in fact, fail. Analogously, subjects in the 
continued outcome condition were told that the firm selected at random did, in fact, remain viable. 
Two subject groups were used to test whether familiarity with the task was an important 
determinant of the extent of this bias.⁹ 
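
The degrees of freedom implied by this fully crossed between-subjects design can be checked with a short calculation. The sketch below is illustrative only: the function name `factorial_anova_df` is hypothetical, and it assumes every one of the 308 subjects (147 MBA students plus 161 auditors) contributes exactly one observation.

```python
from itertools import combinations
from math import prod


def factorial_anova_df(levels, n_subjects):
    """Degrees of freedom for a fully crossed between-subjects ANOVA.

    levels: mapping of factor name -> number of levels.
    n_subjects: total observations (one per subject is assumed).
    """
    names = list(levels)
    df = {}
    # Main effects and every interaction: product of (levels - 1).
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            df[" x ".join(combo)] = prod(levels[name] - 1 for name in combo)
    # Error df is subjects minus the number of design cells.
    df["Error"] = n_subjects - prod(levels.values())
    df["Total"] = n_subjects - 1
    return df


exp1 = factorial_anova_df(
    {"Accountability": 3, "Outcome": 3, "Group": 2}, n_subjects=147 + 161
)
for source, value in exp1.items():
    print(f"{source:36s}{value:5d}")
```

The 3 x 3 x 2 design has 18 cells, so the error term carries 308 - 18 = 290 degrees of freedom and the total carries 307.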


7 Camerer et al. (1989) use this paradigm. To ensure that uninformed subjects can estimate other uninformed subjects’ 
estimates without bias, they had University of Chicago MBA students guess uninformed Wharton MBA students' 
predictions. Their judgments were randomly distributed around the judgments of the uninformed Wharton MBA 
students. Research on knowledge projection indicates that individuals' estimates of what others know are largely 
determined by what they themselves know or think they know (Nickerson et al. 1987). 

8 The particular order of the evidence is not important to this experiment but all subjects must see the evidence in the same 
order. 

9 Although the task is not familiar to MBA students, it is not unreasonable given their business interests and educations. 
Hypotheses 

Subjects in the failed (continued) outcome knowledge conditions were told that the selected 
firm did (did not) fail. However, this outcome knowledge is irrelevant since subjects were asked 
to estimate what others, who did not have this information, would estimate. Consistent with the 
curse of knowledge, subjects with failed (continued) outcome knowledge are expected to make 
higher (lower) estimates than those with no outcome knowledge. 

If the cognitive view of hindsight bias is correct, accountability will not mitigate the curse 
of knowledge in subjects’ estimates. Similarly, if curse of knowledge effects are “hardwired” in 
individuals, experience will not mitigate the effect in subjects’ estimates. 


Analysis and Results 


Means and standard deviations for the estimate are in table 1, panels A and B for the auditors 
and MBAs, respectively.¹⁰ Results of an ANOVA with planned comparisons are in table 2. 

Subjects given failed (continued) outcome knowledge make higher (lower) estimates than 
those given no outcome knowledge. On average, MBA (auditor) subjects with failed outcome 
knowledge estimate the likelihood of failure 11 (16) points higher than those with continued 
outcome knowledge. See figure 2. Planned contrasts that compare outcome conditions with no 
outcome conditions are significant (t = 3.20, p < .001 and t = 1.81, p < .04, one-tailed, for failed 
and continued outcome knowledge, respectively). Stronger results for failed outcome knowledge 
are consistent with results in psychology that indicate greater hindsight bias for “occurrences” 
than for “nonoccurrences” (Fischhoff 1977; Fischhoff and Beyth 1975; Wasserman et al. 1991). 
Since the base rate of failure is low, failure may be viewed as an event while continuing as a going 
concern is simply maintaining the status quo. Planned comparisons of pre-accountable conditions 
to post- and nonaccountable conditions are not significant (t = .61, p = .54 and t = .97, p = .33, 
respectively).¹¹ Thus, the curse of knowledge exists in this setting and it is not mitigated by 
accountability. Finally, although the means indicate greater curse of knowledge effects for 
auditors than for MBA subjects, the difference is not statistically significant (t = .64, p = .52). 
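
As a rough check on the reported significance levels, a t-statistic with this many error degrees of freedom can be converted to a p-value using the standard normal distribution. This is a minimal sketch under that large-sample assumption, not the paper's own computation; the helper name `p_from_t` is hypothetical.

```python
from statistics import NormalDist


def p_from_t(t, tails=1):
    """Approximate p-value for a t-statistic, treating it as standard normal.

    The approximation is reasonable here because the error term has about
    290 degrees of freedom; an exact t distribution gives slightly larger values.
    """
    upper_tail = 1.0 - NormalDist().cdf(abs(t))
    return tails * upper_tail


# One-tailed planned contrasts for outcome knowledge.
p_failed = p_from_t(3.20)
p_continued = p_from_t(1.81)

# Two-tailed comparisons for accountability and subject group.
p_pre_vs_post = p_from_t(0.61, tails=2)
p_pre_vs_non = p_from_t(0.97, tails=2)
p_group = p_from_t(0.64, tails=2)

for label, p in [
    ("failed vs. none", p_failed),
    ("continued vs. none", p_continued),
    ("pre vs. post", p_pre_vs_post),
    ("pre vs. non", p_pre_vs_non),
    ("auditor vs. MBA", p_group),
]:
    print(f"{label:20s} p = {p:.4f}")
```

Under this approximation the outcome contrasts fall below the thresholds reported in the text, while the accountability and group comparisons land close to the reported values of .54, .33, and .52.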


Discussion 


This experiment finds that subjects—auditors and MBA students—are susceptible to 
outcome knowledge that should be ignored, and that accountability is not an effective debiaser 
of the curse of knowledge. According to the framework in Kennedy (1993), data-related biases 
such as the curse of knowledge are unlikely to be reduced by accountability, and the present results 
are consistent with this framework. While accountability may motivate additional cognitive 
effort, that effort does not necessarily translate into less biased judgment, since the effort may be 
misdirected. 


10 Given a limited number of subjects, I wanted to ensure that there was sufficient power to measure both accountability 
and outcome effects if each or either was present. Therefore, roughly twice as many pre- and nonaccountable subjects 
were allocated to the high and low outcome conditions compared to the no outcome conditions. Fewer subjects were 
also assigned to all post-accountable conditions since these conditions were not of direct interest but were included to 
rule out an alternative explanation. The different cell sizes for the MBA subjects are not of great concern because a 
Hartley test does not reject the null hypothesis of homogeneous variances (Winer 1971). 

11 The marginal accountability x group interaction is a result of the pre-accountable auditors’ judgments being lower on 
average (54.1) than those of pre-accountable MBA students (61.3). Auditors’ and MBAs’ estimates did not differ 
significantly in either post-accountable or nonaccountable conditions, but post-accountable auditors’ estimates were 
higher on average (58.5) than those of post-accountable MBA students (53.4). The reason for this cross-over in the 
effects of pre- and post-accountable conditions between MBAs and auditors is unclear but it does not seem to be related 
to the curse of knowledge. 
This result is not likely due to an ineffective accountability manipulation. First, the same 
manipulation that proved effective in mitigating recency among MBA subjects on a closely 
related task in Kennedy (1993) was used here with similar subjects. Second, post-experimental 
questionnaire responses provide additional evidence of an effective accountability manipulation. 
A MANOVA on subjects’ responses to questions regarding the likelihood that they will be 
contacted regarding their responses, motivation to provide justifiable responses, effort required 
by the task, effort expended, and task difficulty reveals a significant accountability effect (p < 
.001), a group effect (p < .001) and an accountability x outcome x group effect (p < .05). Further 
analyses show that accountable subjects in both subject groups believed they were more likely 
to be contacted regarding their responses than did nonaccountable subjects (t = 7.56, p < .0001). 
All conditions reported high levels of effort required by the task but, importantly, pre-accountable 
subjects reported expending more effort than nonaccountable subjects on the task (t = 1.76, p < 
.04, one-tailed). Among auditors this effect is particularly strong. Both pre- and post-accountable 
auditors report expending significantly more effort than nonaccountable auditors (t = 2.63, p < 
.01, and t = 2.47, p < .01, one-tailed, respectively). Further, I counted the number of auditors in 
each accountability condition who noted the evidence items on their materials during the evidence 
presentation. A greater proportion of pre-accountable than post- and nonaccountable subjects 
took notes (p < .07, Fisher's Exact Test (one-tailed)). While not conclusive, note-taking is 
consistent with the goal of being prepared to provide justifiable responses if required. Reported 
effort expended on the task also correlates positively and significantly with subjects' reported 
motivation to provide justifiable responses (r = .43, p < .01), although the ANOVA revealed no 
significant accountability effects for the latter variable. 
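
The note-taking comparison relies on a one-tailed Fisher's Exact Test on a 2 x 2 table (pre-accountable versus other auditors, by took notes versus did not). The paper does not report the underlying counts, so the sketch below uses clearly hypothetical counts and a hypothetical helper name, `fisher_exact_one_tailed`, purely to show how such a p-value is obtained from the hypergeometric distribution.

```python
from math import comb


def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins held fixed, of a top-left
    count at least as large as the observed count a.
    """
    row1 = a + b            # size of the first group
    col1 = a + c            # total "successes" (e.g., note-takers)
    n = a + b + c + d       # total subjects
    largest_a = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, largest_a + 1)
    )
    return numerator / comb(n, col1)


# Hypothetical counts, NOT taken from the paper:
# 18 of 50 pre-accountable auditors took notes versus 22 of 111 others.
p_notes = fisher_exact_one_tailed(18, 32, 22, 89)
print(f"one-tailed p = {p_notes:.3f}")
```

A small p-value here would indicate that note-taking is more common among pre-accountable subjects than the fixed margins alone would suggest.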

It is also conceivable that auditors are implicitly accountable in a laboratory setting due to 
carry-over effects from the audit environment. Consistent with this argument, auditors' responses 
to the effort-related questions in the post-experimental questionnaire were significantly greater 
than those of MBA students. Specifically, auditors reported spending more effort on the task (t 
= 8.54, p < .0001), rated the task as requiring more effort (t = 4.92, p = .0001), rated the task as 
more difficult (t = 7.42, p < .0001), and were more motivated to provide justifiable responses (t 
= 2.31, p < .02) than MBA students. Thus, if the curse of knowledge is due to insufficient effort, 
we should see smaller curse of knowledge effects with auditor subjects. In fact, auditors seem to 
exhibit greater curse of knowledge effects, although the difference is not significant. Finally, a 
power analysis reveals that the power of the test was adequate (between .80 and .90) to detect 
differences in accountability conditions if such differences existed.¹² From these tests it appears 
that, if the curse of knowledge is not mitigated by accountability, the reason is not an ineffective 
accountability manipulation.¹³ 

A second experiment, following, tests the robustness of the curse of knowledge in another 
important audit setting. A preliminary analytical review task is used and, again, the potential of 
accountability and experience to debias the curse of knowledge is examined. 


12 The power tests considered an alpha level of .05, differences of five in likelihood judgments to be of practical 
importance, and used a variance of 400 (20²) to be conservative (Winer 1971). 

13 The ideal way to rule out an ineffective accountability manipulation is to give the same subjects another task for which 
accountability is expected to work. Experiment 2 of this paper accomplishes this objective by using the same MBA 
subjects as in Kennedy (1993) who performed a belief revision task for which accountability was expected to be (and 
was) an effective debiaser of recency. 
TABLE 2 
Experiment 1 
Analysis of Variance 
Dependent Variable: Likelihood Estimateᵃ 

Source of Variance                        DF    F-statistic    P-value 
Accountabilityᵇ                            2        0.49         .61 
Outcomeᶜ                                   2       14.27         .01 
Group                                      1        0.06         .80 
Accountability x Outcome                   4        0.56         .69 
Accountability x Group                     2        2.56         .08 
Outcome x Group                            2        0.36         .70 
Accountability x Outcome x Group           4        0.61         .66 
Error                                    290 
Total                                    307 

Planned Comparisons                       DF    t-statistic    P-valueᵈ 
Fail versus no outcome conditions          1        3.20         .01 
Continued versus no outcome conditions     1        1.81         .04 
Pre-accountable versus post-accountable    1        0.61         .54 
Pre-accountable versus non-accountable     1        0.97         .33 


ᵃ Likelihood estimate is subjects' estimate of what others would judge as the likelihood that this firm would fail. 

ᵇ Pre- (post-) accountable subjects were informed they were accountable for their estimates before (after) they saw 
evidence. Nonaccountable subjects were informed that their responses were confidential and not identifiable. 

ᶜ Subjects in the fail (continue) conditions were told that the firm failed (continued) to exist. Subjects in the none condition 
were not given any outcome knowledge. 

ᵈ One-tailed. 

V. EXPERIMENT 2: ANALYTICAL REVIEW TASK 


Method 


Subjects and Procedure 


As in experiment 1, there were two subject groups: 86 executive MBA students from Duke 
University and 322 auditors attending manager training sessions for a Big 6 firm.¹⁴ 
Post-experimental questionnaire responses reveal no statistically significant differences across 
conditions for either subject group with respect to reported forecasting knowledge or business-related 
experience. 

The experiment was conducted in a group setting. The subjects received a booklet containing 
instructions and the experimental materials and were allowed to progress at their own speed. 


14 The auditor subjects included those of experiment 1 and the MBA subjects included those of Kennedy (1993). The same 
subjects held accountable in experiment 1 and Kennedy (1993) were accountable in experiment 2 to avoid carry-over 
effects from the previous experiment. Specifically, subjects who were post-accountable in prior experiments were 
pre-accountable in experiment 2. Combining both pre- and post-accountable subjects from the prior experiments results in 
greater n's for accountability. Subjects not previously accountable were not accountable in this experiment either. 
Task 

Each subject’s task was to estimate what they thought other subjects, who did not know the 
actual outcome, would estimate as 13th quarter’s unit sales for two different products (X and Y) 
of a hypothetical firm. Since subjects were not asked to estimate the 13th quarter’s unit sales 
directly, the actual outcome was irrelevant to the task at hand. This was necessary here, as in the 
previous experiment, to avoid a motivational source of the curse of knowledge, and make the task 
analogous to that of auditors performing analytical review. In analytical review, auditors are 
motivated to make reasonable predictions that permit the identification and discovery of unusual 
fluctuations or relationships; no premium is associated with predicting unaudited book values. 

The stimuli consisted of two graphs depicting unit sales over the past 12 quarters for two 
different products (X and Y). The trend of the graphs was varied to determine the robustness of 
the curse of knowledge and accountability effects (Heintz and White 1989). The two graphs were 
generated from the same set of data (see figure 3). The points on graph Y are those of graph X 
multiplied by a negative factor, plus a constant. The slopes for X and Y are decreasing and 
increasing, respectively. A linear time series regression on the 12 quarters of unit sales has an R² 
of .736 for each graph; each graph has the same statistical predictability. Each graph was on a 
separate page with the Y graph following the X graph. Subjects responded on the same page that 
the graph appeared. 
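
The claim that both graphs have the same statistical predictability follows from the fact that R² in a simple linear regression is unchanged when the dependent variable is multiplied by a nonzero factor and shifted by a constant. The sketch below illustrates this with a made-up series; the sales figures, the factor of -0.8, and the constant of 90 are assumptions for illustration and are not the paper's stimuli.

```python
def fit_stats(y):
    """Return (slope, r_squared) of an OLS line fitted to y against t = 1..n."""
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    return s_ty / s_tt, s_ty ** 2 / (s_tt * s_yy)


# Hypothetical 12 quarters of unit sales for product X: a downward trend
# whose last two quarters tick upward.
sales_x = [48, 45, 46, 41, 43, 38, 36, 37, 32, 30, 33, 35]

# Product Y: the X series times a negative factor, plus a constant
# (both values are illustrative assumptions).
sales_y = [-0.8 * value + 90 for value in sales_x]

slope_x, r2_x = fit_stats(sales_x)
slope_y, r2_y = fit_stats(sales_y)
print(f"X: slope = {slope_x:.3f}, R^2 = {r2_x:.3f}")
print(f"Y: slope = {slope_y:.3f}, R^2 = {r2_y:.3f}")
```

The slopes have opposite signs while the two R² values coincide, so any difference in subjects' judgments between the graphs cannot be attributed to one trend being easier to fit than the other.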


Independent Variables and Dependent Variables 

A 2 x 3 x 2 (accountability x outcome knowledge x subject group) experimental design was 
used. The two accountability conditions were pre-accountable and not accountable. Pre-accountable 
subjects were informed that they were accountable for their judgments before they saw the 
graphs. The three levels of outcome knowledge were high, low, and none. Outcome knowledge 
was indicated on the graph by an asterisk. High outcome knowledge subjects saw an asterisk for 
the 13th quarter that was higher than the 12th quarter's unit sales, and low outcome knowledge 
subjects saw an asterisk for the 13th quarter that was lower than the 12th quarter's unit sales. 
Pretests (with no outcome knowledge provided) suggested the range of points to which subjects 
are likely to extrapolate.¹⁵ The high (low) outcomes were placed at least two standard deviations 
above (below) the mean prediction obtained from pretesting, and equidistant from the 12th 
quarter's observation. 
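
One way to satisfy both placement constraints at once (at least two standard deviations from the pretest mean, and equal distance from the 12th quarter) is sketched below. The pretest predictions, the 12th-quarter value, and the function name `place_outcomes` are hypothetical; the paper does not describe its exact procedure, and this version simply picks the smallest distance that meets both conditions.

```python
from statistics import mean, stdev


def place_outcomes(pretest_predictions, q12):
    """Return (high, low) outcome values that are equidistant from q12 and
    lie at least two standard deviations above/below the pretest mean."""
    m = mean(pretest_predictions)
    s = stdev(pretest_predictions)
    # Smallest symmetric distance from q12 that clears both 2-SD bounds.
    distance = max((m + 2 * s) - q12, q12 - (m - 2 * s))
    return q12 + distance, q12 - distance


# Hypothetical pretest predictions for the 13th quarter and a hypothetical
# 12th-quarter observation.
pretest = [30, 32, 28, 31, 29, 33, 27]
q12 = 34

high, low = place_outcomes(pretest, q12)
print(f"high outcome = {high:.2f}, low outcome = {low:.2f}")
```

Because the 12th-quarter value need not equal the pretest mean, one of the two outcomes will usually sit more than two standard deviations from that mean, which is consistent with the "at least" wording.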

Three dependent variables were employed: (1) the estimate of what other subjects would 
predict for the 13th quarter unit sales (referred to as the "estimate" and denoted X or Y), (2) the 
likelihood that actual sales would be as high (denoted XH and YH for graphs X and Y, 
respectively) or (3) as low (denoted XL and YL for graphs X and Y, respectively) as the asterisk 
indicated. 

Hypotheses 


The high (low) outcome conditions were provided with the 13th quarter's actual unit sales 
(outcome knowledge) that was either high or low relative to the 12th quarter. This outcome 
knowledge was irrelevant since subjects were asked to estimate what others, who did not have 
this information, would predict. Consistent with the curse of knowledge, subjects with high (low) 
outcome knowledge are expected to make higher (lower) estimates than those with no outcome 
knowledge. Subjects are also expected to judge high (low) outcomes as more likely when high 
(low) outcome knowledge is provided. As in experiment 1, accountability and experience are not 
expected to debias the curse of knowledge. 


Results 


Means and standard deviations for subjects' estimates of the 13th quarter's unit sales and 
likelihood judgments for both graphs are provided in table 3, panels A and B for the auditors and 
MBA students, respectively. A MANOVA on all the dependent variables (X, XH, XL, Y, YH, 
and YL) reveals a significant outcome effect (Wilks' Lambda, F = 8.01, p < .0001) but no 
accountability effect or interactions. The related ANOVAs are in table 4. I discuss the results 
for the estimates (X and Y) first, followed by results for the likelihood judgments (XH, XL, YH, 
and YL). 

For each graph the mean estimates are highest in the high-outcome conditions and lowest in 
the low-outcome conditions. The no-outcome estimates fall between the two. The ANOVAs 
indicate a significant main effect for outcome on graphs X (F = 9.66, p < .0002) and Y (F = 7.33, 
p < .0012). Planned comparisons between outcome conditions and the control condition (no 
outcome) reveal an interesting pattern. Both high and low outcome conditions are significantly 
different from no outcome conditions in the expected direction for graph X (t = 2.08, p < .02 and 
t = 4.51, p < .001, one-tailed, respectively). For graph Y, high outcome conditions are significantly 
higher than no outcome conditions (t = 2.63, p < .01, one-tailed) but low outcome conditions are 
only marginally lower than no outcome conditions (t = 1.44, p < .08, one-tailed). These results 
are fairly consistent with Heintz and White (1989) who investigated the influence of outcome 
knowledge in analytical review and found: (1) outcome effects regardless of the direction of the 
data trend, (2) decreasing unaudited values (i.e., outcome knowledge) have greater influence than 
increasing unaudited values, and (3) an unaudited value that represents a trend reversal has greater 


15 Undergraduate accounting students estimated the 13th quarter's unit sales for each product. 

16 The MANOVA also reveals a significant group effect (Wilks’ Lambda, F = 6.52, p < .0001). MBA subjects, on average, 
estimated Y higher than did auditors, in all conditions. However, since group did not interact with outcome or 
accountability, inferences regarding these factors are not affected by group. No other factors or interactions were 
significant. 
TABLE 3 
Experiment 2 
Auditor and MBA Subjects’ Estimates and Likelihood Judgments of 13th 
Quarter’s Sales 
Mean (Standard Deviation) 

Treatment Conditions                   Estimatesᶜ          Likelihood Judgmentsᵈ 
Accountabilityᵃ   Outcomeᵇ     N      X       Y        XH       XL       YH 

Panel A: Auditor Subjects 
Accountable       High        65    32.7    61.0     26.5     24.2     24.9 
                                    (5.3)   (6.5)   (21.1)   (18.9)   (19.9) 
Accountable       Low         64    27.8    56.3     18.2     31.4     16.7 
                                    (6.2)   (6.0)   (18.2)   (22.0)   (16.3) 
Accountable       None        66    30.5    59.1     23.5     20.9     18.7 
                                    (6.5)   (6.1)   (19.3)   (16.3)   (14.9) 
Nonaccountable    High        41    33.8    58.4     34.0     20.5     18.8 
                                    (5.2)   (5.8)   (23.1)   (18.5)   (18.0) 
Nonaccountable    Low         43    27.2    56.8     18.1     33.0     16.7 
                                    (5.6)  (10.0)   (19.4)   (23.3)   (20.3) 
Nonaccountable    None        43    31.4    58.1     24.4     20.7     14.0 
                                    (4.6)   (5.2)    (8.3)   (16.9)   (10.2) 

Panel B: MBA Subjects 
Accountable       High        19    31.1    65.8     27.0     20.2     28.8 
                                    (5.9)   (4.8)   (18.5)   (16.8)   (22.4) 
Accountable       Low         19    26.0    61.0     16.6     29.5     21.1 
                                    (4.7)   (6.0)   (21.9)   (13.3)   (15.8) 
Accountable       None        20    30.4    61.2     22.1     20.4     21.3 
                                    (5.1)   (5.0)   (16.1)   (13.4)   (10.4) 
Nonaccountable    High         9    32.3    64.8     23.3     27.2     30.6 
                                    (5.3)   (6.6)   (18.0)   (25.9)   (28.3) 
Nonaccountable    Low          9    26.1    59.7     12.8     25.8     12.0 
                                    (2.3)   (5.2)   (10.0)   (19.7)    (8.6) 
Nonaccountable    None        10    30.5    61.2     22.0     19.5     14.5 
                                    (4.3)   (4.4)    (8.3)   (16.7)   (13.0) 

YL column, Mean (Standard Deviation), as extracted: 
25.9 (19.8)   35.2 (25.4)   26.8 (23.1)   37.7 (27.9)   35.9 (27.4)   27.8 (23.3) 


ᵃ Accountable subjects were informed that they were accountable before they saw the graphs. 

ᵇ Outcome knowledge was indicated by an asterisk that was either high or low relative to the 12th quarter's sales. 

ᶜ X and Y refer to the estimates for the 13th quarter's sales for graphs X and Y, respectively. 

ᵈ XH, XL, YH, and YL refer to likelihood judgments that the 13th quarter sales were as high as the high outcome (XH 
and YH) or as low as the low outcome (XL and YL) indicated on graphs X and Y. 
influence than an unaudited value consistent with prior periods. In this paper, graph X (Y) is 
generally decreasing (increasing) but the last two quarters in the time series are increasing 
(decreasing), reversing the previous trend. In graph X (Y), a low (high) outcome, which is a 
reversal of the most recent trend, is most influential. Therefore, results (1) and (3) of Heintz and 
White (1989) replicate here. 

Subjects' likelihood judgments that actual sales are at least as high as the high outcome (XH 
and YH) or at least as low as the low outcome (XL and YL) are also influenced by outcome 
knowledge. For XH and YH, predictions are high outcome > no outcome > low outcome, while 
for XL and YL, predictions are low outcome > no outcome > high outcome. Comparisons of both 
high and low outcome conditions with no outcome conditions are significant and in the expected 
direction for XH (t = 1.54, p < .06 and t = 2.15, p < .02, one-tailed, respectively). But for XL, only 
high outcome conditions are significantly different from no outcome conditions (t = 3.23, p < 
.001). The results for YH and YL are similar. For YH, high outcome conditions are significantly 
higher than no outcome conditions (t = 3.24, p < .001) but low outcome conditions are not 
significantly lower (t = .18, p = .86). For YL, low outcome conditions are higher than no outcome 
conditions (t = 1.40, p < .08, one-tailed) but high outcome conditions are directionally opposite 
to expectations; they are higher rather than lower than the no outcome conditions. These 
likelihood results are generally weaker than the estimate results. Possibly, the steep slope of the 
graphs induces some ceiling and floor effects with respect to likelihoods of high and low 
outcomes. 

Accountability did not mitigate the curse of knowledge. ANOVAs revealed no significant 
accountability effects for estimates or likelihood judgments with one exception (see table 4). A 
marginally significant accountability effect was found for the likelihood of a high outcome in 
graph Y; however, its direction is opposite to that of debiasing; curse of knowledge effects were 
exacerbated by accountability in that particular task. Finally, an insignificant outcome x group 
interaction for all dependent variables indicates that auditors' curse of knowledge is as great as 
that of MBA subjects. 


Discussion 


Curse of knowledge effects found in a going concern context in experiment 1 generalize to 
another context: analytical review. The auditor subjects included those from experiment 1 as well 
as those from Kennedy (1993). The auditors in Kennedy (1993) did not exhibit recency effects 
in that experiment. This experiment shows that those subjects are not immune from all judgment 
biases and reinforces the essence of the framework: that biases originate from different sources 
(effort and data). The results of this experiment corroborate the results of experiment 1 and 
suggest that institutionalized systems of accountability, such as the audit review process, are 
unlikely to mitigate these effects. As in the previous experiment and particularly with respect to 
the MBA subjects, these results are not likely due to an ineffective accountability manipulation. 
The MBA subjects included those of Kennedy (1993)—the very subjects who were susceptible 
to recency except when held accountable for their judgments. In fact, a specific MANOVA 
comparison of pre-accountable subjects from Kennedy (1993) against nonaccountable subjects 
yields no significant accountability or accountability x outcome effects (F = .68, p < .66 and F = 
.74, p < .74) in this task, yet a strong outcome effect as hypothesized (F = 3.57, p < .001). 


17 Ideally, the MBA subjects from experiment 1 would be included in this experiment as well. Due to limited administration 
time for that experiment, this was not possible. 
TABLE 4 
Experiment 2 
Analysis of Variance 

                                                      F-statistics 
Dependent Variableᵃ                    Estimates              Likelihood Judgments 
Source of Variance             DF      X          Y          XH         XL         YH           YL 
Accountabilityᵇ                 1     0.47       1.28       0.00       0.00       3.59*        0.65 
Outcomeᶜ                        2    21.97***    8.21***    6.64***    5.50***   [illegible]   1.31 
Group                           1     2.12      23.98***    1.96       0.32       1.94        28.68*** 
Accountability x Outcome        2     0.30       0.31       0.20       0.11       0.23         2.41* 
Accountability x Group          1     0.00       0.03       1.11       0.11       0.06         0.51 
Outcome x Group                 2     0.23       1.18       0.14       0.49       1.21         0.60 
Account. x Outcome x Group      2     0.08       0.40       0.36       0.95       1.23         0.14 
Error                         396 
Total                         407 

Planned Comparisons            DF                     t-statistics 
High vs no outcome              1     2.08**     2.63***    1.54*      3.23***    3.24***     -1.40 
Low vs no outcome               1     4.51***    1.44*      2.15**      .90        .18         1.40 


ᵃ X and Y refer to the estimates for the 13th quarter’s sales for graphs X and Y, respectively. XH, XL, YH, and YL refer 
to likelihood judgments that 13th quarter sales were at least as high as the high outcome (XH and YH) or as low as the 
low outcome (XL and YL) indicated by the asterisk on graphs X and Y. Predictions regarding outcome knowledge for 
X, XH, Y, and YH are high > none > low, while predictions for XL and YL are low > none > high. 

ᵇ Accountable subjects were told before they saw the graphs that they might have to justify their responses. 

ᶜ High and Low outcome conditions saw asterisks higher or lower than the 12th quarter’s sales. 

***, **, and * indicate statistical significance at p < .01, .05, and .10, respectively. 


These experiments do not suggest that the curse of knowledge cannot be debiased, only that 
accountability and experience are not the means by which to do it. Successful debiasing seems 
to require making a compelling response less salient. The following experiment tests this notion. 


VI. EXPERIMENT 3: COUNTEREXPLANATION AS A DEBIASING MECHANISM 


Counterexplanation is a potential debiasing mechanism that relates directly to the cognitive 
nature of the curse of knowledge. It requires individuals to focus on why the outcome might not 
have occurred and thus weakens causal connections between inputs to the judgment and the 
outcome that is observed (Heiman 1990; Koonce 1992; Lipe 1991). This debiasing mechanism 
is similar in spirit to one successfully used by Butler (1985) in a risk assessment task. He asked 
auditors a series of questions designed to redirect their attention from the specific risk assessment 
problem at hand toward a more general reference class and to consider their (limited) ability to 
make judgmental risk assessments. 
Method 


Since experience was demonstrated to have no effect on the curse of knowledge in 
experiments 1 and 2, undergraduate students (143) were subjects here. This experiment was 
identical to experiment 2 except that (1) only two outcome conditions were used (high and low), 
and (2) a counterexplanation condition was added in place of accountability. 

Subjects in the counterexplanation condition were asked to explain in writing why the 
outcome (indicated by an asterisk) might be considered unlikely, before they made their estimate 
of what other subjects would estimate as the 13th quarter's sales for products X and Y. Clearly, 
this manipulation works against the tendency to make causal connections between the outcome 
and other information provided (Lipe 1992).¹⁹ Subjects in a control condition were not asked to 
explain anything about the outcome. An outcome x counterexplanation interaction is predicted; 
control condition subjects are expected to exhibit curse of knowledge effects (high outcome > low 
outcome for X, XH, Y and YH and low outcome > high outcome for XL and YL) while 
counterexplanation condition subjects are not. 


Results 


The means and standard deviations for each dependent variable in each condition are in table 
5 and results of a MANOVA and separate ANOVAs are in table 6. A MANOVA on the estimates 
(X and Y) and the likelihood judgments (XH, XL, YH, and YL) reveals a significant outcome 
effect (Wilks' Lambda, F = 10.6, p < .0001), and an outcome x condition interaction (F = 2.23, 
p < .05). Planned comparisons between the counterexplanation condition and the control for each 
dependent variable reveal that the nature of the interaction is consistent with expectations. 
However, the results are stronger for graph X than for graph Y, and stronger for the estimates than 
for the likelihood judgments. 

Subjects with high outcome knowledge estimate X and Y significantly higher than subjects 
with low outcome knowledge (p's < .01) except when they counterexplain (p's > .10). Similarly,
the likelihood judgment of XH (XL) is significantly higher, on average, in the high (low) outcome
condition than in the low (high) outcome condition when subjects do not counterexplain (p < .01)
but not when subjects do counterexplain (p > .10). However, when YH and YL are examined on
a univariate basis there are no significant outcome or outcome × counterexplanation effects.20


Discussion
While not conclusive, these results suggest potential debiasing mechanisms conducive to 
professional skepticism. For example, requiring auditors to consider and explain why unaudited 


18 A MANOVA comparing undergraduates’ responses (control condition) to auditor and MBA responses in experiment
2 shows no significant group effect (F = 1.38, p = .18) or outcome × group effect (F = 1.50, p = .12). When each dependent
variable is examined there are no differences between subject groups for X, XH, or XL. For Y, undergraduate responses
were between those of MBAs and auditors and significantly higher than those of auditors (p < .04). Their responses to YH were
higher than those of auditors and MBAs (p < .02). Similar, but only marginally significant, results were obtained with
YL (p < .07). In no case did the group effect interact with outcome.

19 Although this manipulation is strong, it is not a demand effect since subjects are asked to estimate what others, who 
do not know the outcome, would predict. Subjects are unlikely to expect others to think of alternatives that were not 
obvious to them until prompted. This manipulation likely also requires additional cognitive effort. However, it is not 
the effort per se, but rather the way that effort is directed (i.e., toward considering alternative outcomes) that debiases 
the curse of knowledge.

20 Again, graph X was always first, which may account for the stronger effect for X. Weaker results for the likelihood
judgments could be due to ceiling and floor effects as in experiment 2, or, perhaps individuals find making point 
estimates easier than making likelihood judgments. 
book values are unlikely as part of analytical review may be a low cost means of curbing these
effects. Second, although these results were obtained with a strong manipulation that instructed 
subjects to be skeptical, aspects of the audit setting may encourage such skepticism. For example, 
Libby and Trotman (1993) argue that reviewers devote greater attention to evidence inconsistent 
with the conclusions drawn by their subordinates. Their results combined with those here suggest 
that the review process, which is already institutionalized in audit firms, may assist in debiasing 
the curse of knowledge after all. Unlike in Kennedy (1993) and Tan (1994), however, it may be
the reviewer rather than the reviewee who mitigates this bias. Further tests are required to clarify 
and determine the robustness of the findings here. 


VII. SUMMARY


This paper investigates the curse of knowledge in judgment and the extent to which it can be 
mitigated by experience, accountability, and counterexplanation. Curse of knowledge effects are 
found among both auditors and MBA students in experiments using both going concern and 
analytical review type tasks. Accountability is ineffective in mitigating these effects. This 
research extends prior work by further testing the debiasing framework of Kennedy (1993). The 
curse of knowledge, identified a priori as data-related, was subjected to both effort-related and 
data-related debiasing prescriptions. Future research could test effort-related biases similarly. 

The practical implications of these findings for going concern evaluation and analytical 
review are important. For instance, the perceived culpability of auditors who do not modify 
reports of clients that subsequently fail may be greater in the eyes of shareholders, the SEC, expert 
witnesses, and jurors, all of whom have one more piece of information than the auditors had: that
the client did indeed fail. In cases of fraudulent reporting or lawsuits alleging negligence of
auditors, it is common for other auditors to be called upon to review the audit papers and comment
on the quality of the audit provided. Once the reviewing firm knows there is audit failure, the 
causal links between audit evidence and the discovery of fraud or improprieties may appear much 
stronger. With the benefit of hindsight, the reviewing audit firm may be less than generous in its 
interpretations of the diligence with which the audit was conducted. The implications for the use 
of analytical review are also important, particularly for audit firms that do not rely primarily on 
regression and other statistical techniques in forming their expectations for account balances. 
Because the auditor often has knowledge of unaudited book values prior to beginning the audit, 
he or she may be unknowingly influenced by this knowledge in forming expectations and in 
determining whether identified fluctuations require investigation.21

There are limitations, of course, to the experiments reported here. First, auditors work in 
much richer information environments than those provided here. Second, unlike in the audit 
environment, accountable subjects have no penalty imposed on them for providing unjustifiable 
responses in these experiments. Third, subjects were provided with extreme outcomes. Experi- 
ment 3 suggests mechanisms that encourage one to consider why the outcomes are unlikely might 
be most effective in counteracting the bias. Since the outcomes were extreme however, the 
debiasing mechanism forced subjects’ judgments to become more regressive. It is not clear what 
would happen if this same mechanism were used with less extreme outcomes. A point of 


21 According to Hirst and Koonce (1994) decision aids and statistical methods are rarely if ever used in analytical 
procedures at the planning stage. The most common expectation for the current year’s account balance at all stages of 
analytical procedures (planning, testing and review) is the prior year’s account balance. Potentially, curse of knowledge 
problems could be moderated somewhat in practice by relying on rules of thumb such as investigation of all deviations 
from last year that are greater than 15 percent. 
reassurance though is that individuals’ judgments are typically insufficiently regressive. Therefore,
curse of knowledge effects are likely less serious when the outcome is in the middle of the
distribution of possible outcomes. More research that builds on this notion would be valuable.

Future research could also employ more realistic analytical review tasks that elicit auditors’
investigation intervals. Since this research was interested in the effects of auditors’ experience on 
the curse of knowledge, the task had to be sufficiently general that the comparison group, MBA 
students, could understand it. Hence, estimates rather than investigation intervals were elicited. 
Potentially, estimates could reveal curse of knowledge effects while investigation intervals might 
not. 
ABSTRACT: In our model of negotiated transfer pricing, divisional managers can
make specific investments that enhance the value of intrafirm trade. However, these
investments are irreversible and must be made before divisional managers have
enough information to determine the desired intrafirm transfer. We find that a system
of negotiated transfer pricing will lead to efficient outcomes provided the divisions
can sign fixed-price contracts prior to making their investment decisions. While these
contracts are likely to be renegotiated after the relevant information becomes known,
they nonetheless provide the divisions with effective protection for their specific
investments.


Key Words: Decentralization, Transfer pricing, Negotiation, Investment. 


I. INTRODUCTION


Surveys indicate that negotiated transfer pricing is a common way of accounting for the
exchange of goods and services between the divisions (profit centers) of a firm.1 In its
purest form, negotiated transfer pricing is a laissez-faire system in which headquarters 


1 See, for example, Price Waterhouse (1984) and Eccles (1985).
(H.Q.) gives the divisions complete control over divisional trade and the compensating accounting
charges. As a consequence, all transactions must be mutually agreeable to the divisions
involved. Such a system stands in contrast to administered transfer pricing which limits the
autonomy of divisions.2

It is commonly agreed that a major function of transfer pricing is to achieve coordination 
among the divisions of a firm. In the first place, divisional managers are supposed to focus on the 
profits of their own divisions. In order to induce effort and to mitigate problems of moral hazard, 
managerial compensation is generally tied to divisional profit. At the same time, the transfer 
pricing system should facilitate those intrafirm transactions that are in the entire firm’s interest. 

One would expect a tension between divisional and corporate profits when one division can 
undertake specific investments that are of little or no value in the division’s external lines of 
business. For instance, the supplying division may incur an upfront fixed cost which lowers its 
variable cost of producing an intermediate product used by the buying division. Without a viable 
external market, management of the supplying division may be reluctant to incur such an 
expense.3 To counteract such tendencies, central management could seek to impose an administered
transfer pricing policy. Our main finding, though, is that a decentralized system of divisional
profit measurement combined with negotiated transfer pricing can create desirable managerial 
incentives at the divisional level and, at the same time, solve the intrafirm resource allocation 
problem. 

The scenario considered in this paper assumes that divisional managers have symmetric 
information about the profitability of intracompany trade. However, neither division can verify 
the levels of specific investment or the relevant revenue and cost information to a third party such 
as H.Q. At the outset, the division managers can agree to a simple fixed-price contract, which 
specifies the quantity of the intermediate product to be transferred and the corresponding transfer 
payment. Subsequently, the divisional managers independently undertake their specific invest- 
ments. When the relevant revenue and cost information becomes available to both parties (but not 
to H.Q.), the prior agreement can be renegotiated to take advantage of the new information. 

Williamson (1985, chapter 1) and others have identified the “hold-up” problem for bilateral 
trading problems with specific investment. To illustrate the problem, suppose the parties do not 
sign a prior contract. Assuming equal bargaining strength, each party expects to receive half of 
the ultimate gains from trade. But these gains will not account for the prior investments since those 
investments are sunk at the later negotiation date. As a consequence, each party will tend to 
underinvest: it earns half of the expected gains from trade but bears the full cost of its own 
investment. Holmstrom and Tirole (1991) formalize this argument to conclude that negotiated 
transfer pricing suffers from underinvestment. 
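
The underinvestment logic can be illustrated with a minimal numerical sketch. The gains-from-trade function `2 * sqrt(I)` and the grid search below are hypothetical choices made only for this illustration; they are not taken from the paper. The investing party keeps a fixed share of the gains but bears the full cost of investment.

```python
import math

def gains_from_trade(investment):
    # Hypothetical concave gains from trade created by specific investment.
    return 2.0 * math.sqrt(investment)

def best_investment(share):
    # The investor keeps `share` of the gains but bears the full investment cost.
    grid = [k / 1000 for k in range(0, 2001)]
    return max(grid, key=lambda i: share * gains_from_trade(i) - i)

efficient = best_investment(1.0)  # investor internalizes all of the gains
hold_up = best_investment(0.5)    # equal bargaining strength, no prior contract
print("efficient investment:", efficient)
print("investment under hold-up:", hold_up)
```

With these assumed functional forms, the party that expects only half of the gains chooses a strictly smaller investment than the level that maximizes joint profit.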

When the divisions sign a simple fixed-price contract prior to investing, this agreement 
determines the status quo point in the final negotiation. The incentive to invest then has two 
components. For instance, if the supplying division spends a dollar on specific investment, it 
expects to receive half of the corresponding increase in joint profits (assuming equal bargaining 
power). In addition, the dollar of investment reduces the cost of producing the status-quo quantity, 
and thereby increases the resulting divisional payoff. For a suitable choice of the status-quo 
quantity, the latter effect provides the “other half" of the desired investment incentive. 

Earlier work on the hold-up problem has shown that the optimal level of specific investment 
can be obtained in equilibrium if the parties can commit themselves to play a particular game 


2 See Kaplan and Atkinson (1989) and Eccles and White (1988).
3 Eccles and White (1988, 542-543) illustrate this tendency to underinvest in relationship-specific assets in the context
of “Bacon and Bentham.”
(mechanism) at the renegotiation date. The game is to be designed so that its equilibrium outcome
varies with the underlying state in a way that provides each party with the appropriate investment
incentives. The papers by Rogerson (1992), Hermalin and Katz (1993), and Aghion et al. (1994) fall
into that category.4 In Chung (1991) and Aghion et al. (1994) the parties also sign a fixed-price
contract at the outset, but in addition, they contractually agree to a particular renegotiation
mechanism.5

Most of the recent work on transfer pricing has been concerned with administered transfer 
pricing; see, for example, Harris et al. (1982), Jordan (1989), Christensen and Demski (1990), 
Amershi and Cheng (1990), and Ronen and Balachandran (1988). In these models, H.Q. specifies 
a transfer-pricing formula based on divisional reports and observed variables such as production 
cost. Negotiated and cost-based transfer pricing are studied in Mookherjee and Reichelstein 
(1992) and Vaysman (1994a, 1994b). These models allow for private information about 
divisional revenues and costs; however, they do not consider the possibility that divisions may 
incur upfront fixed costs which enhance the value of intracompany transfers. 

The paper is organized as follows. Section II describes the model. Initially, we shall take it 
as given that each divisional manager maximizes his/her division’s expected profit. Proposition 
2 in section III shows that when a certain separability assumption is satisfied, negotiated transfer 
pricing will induce efficient investments and quantity transfers. In section IV, the previous setting 
is embedded in a larger model, where divisional managers are subject to moral hazard. 
Proposition 3 shows that H.Q. can solve the combined managerial incentive and intrafirm transfer 
problem by offering each manager a compensation function that is linear in divisional income. 
The paper concludes in section V. 


II. THE MODEL 


Consider a multidivisional firm with headquarters (H.Q.) and two divisions. The two 
divisions are assumed to operate in separate markets except for an intermediate product which 
Division 1 (the "upstream" division) can supply to Division 2, the "downstream" division. 
Suppose there is no external market for the intermediate product because it is highly specialized. 
For this good the two divisions are thus in a bilateral monopoly situation. 

If $q$ units of the intermediate product are transferred, Division 1 incurs an incremental cost
$C(q, \theta, I_1)$, where $\theta$ denotes a state variable which is unknown at the outset and $I_1$ represents the
dollar amount of specific investment undertaken by Division 1. For example, the selling division
may acquire fixed assets which reduce the variable cost of producing $q$. Similarly, the buying
division receives an incremental revenue $R(q, \theta, I_2)$ if its specific investment was $I_2$, $q$ units are
transferred, and the state of the world is $\theta$.6 This revenue is stated net of any finishing costs
incurred by Division 2 to sell its output externally.


4 A survey of recent papers addressing the hold-up problem is provided in Edlin and Reichelstein (1994).

5 In Chung’s (1991) model, the original agreement specifies that one party can make a take-it-or-leave-it offer to the other
party at the final negotiation stage. Chung shows that when one party captures the entire ex post surplus, the original 
fixed-price contract can be structured so that both sides will invest efficiently. In order for such an arrangement to be 
credible, however, the party receiving the take-it-or-leave-it offer must believe that if it left the offer, there would be 
no further negotiation, and, in particular, no subsequent surplus sharing. In effect, H.Q. would have to ensure that once
the offer is rejected, the good in question not be traded between the divisions. Such stipulations appear impractical and 
would run contrary to the notion of profit center autonomy. 

6 In our model, the variables $q$, $I_1$, and $I_2$ are all real-valued, while the state variable $\theta$ can be of arbitrary dimension. A
special case of interest is when $\theta = (\theta_1, \theta_2)$, one component affects the revenue function, the other affects the cost function, and the two
components of $\theta$ are stochastically independent.
Under a system of negotiated transfer pricing, the two divisions have to agree on a pair $(q, t)$,
where $t$ is the transfer payment which is charged against the buyer's divisional income and
credited to the seller's income. We note that negotiated transfer pricing makes it unnecessary to
compute a unit price (though for inventory valuation purposes one may need to divide $t$ by $q$). In
contrast, a cost-based transfer pricing rule may allow the buying division to decide on the quantity
transfer at a given unit price which is based on the supplier's cost.

The sequence of events is as follows (see figure 1). At date 1, the division managers negotiate
a fixed-price contract $(\bar q, \bar t)$. Subsequently, at date 2, the divisions undertake their specific investments $I_1$
and $I_2$. At date 3, both divisions observe the state variable $\theta$. In addition, each division manager
is assumed to learn the other party's cost or revenue function, respectively. This informational
situation may result because each division observes the other's investment or simply because
financial information “leaks” across divisions. As a consequence, at date 4 the two parties have
complete information about the incremental costs and revenues that result from transferring $q$
units of the intermediate product. The divisions may then negotiate a new transaction $(\hat q, \hat t)$ which
replaces the original agreement $(\bar q, \bar t)$ of date 1.


FIGURE 1

date 1: $(\bar q, \bar t)$ negotiated
date 2: $(I_1, I_2)$ chosen
date 3: $\theta$ realized
date 4: $(\hat q, \hat t)$ renegotiated

While the two divisions have symmetric information at all dates, H.Q. is assumed to observe
neither the investments nor the actual state $\theta$. This information asymmetry makes it necessary for
H.Q. to design a mechanism that achieves coordination between the divisions. We assume that
all parties share the same beliefs about $\theta$; these beliefs are represented by a probability distribution
$F(\cdot)$ defined on the set of possible states.7

The specific investments decrease cost and increase revenue, respectively. We assume that
both functions are twice differentiable in $q$ and $I_i$, $i = 1, 2$, and that the contribution margin
$R(q, \theta, I_2) - C(q, \theta, I_1)$ is strictly concave in $q$ for all $\theta$, $I_1$, and $I_2$. The transfer quantity $q$ is restricted
to the interval $[0, q_{\max}]$. Given particular levels of investment $I_1$ and $I_2$ and state realization $\theta$, the
efficient transfer quantity will be denoted by $q^*(\theta, I)$, where $I \equiv (I_1, I_2)$. Thus $q^*(\cdot)$ is the unique
maximizer of:


$$M(q, \theta, I) \equiv R(q, \theta, I_2) - C(q, \theta, I_1),$$


the contribution to the firm's overall profit. Let $M(\theta, I) \equiv M(q^*(\theta, I), \theta, I)$. The efficient levels of
investment are the ones that maximize the firm's expected profit,


$$\Gamma(I) \equiv E_\theta[M(\theta, I)] - I_1 - I_2. \tag{1}$$


7 Our model may be viewed as an extreme case of the setting in Demski and Sappington (1984), (1989), where agents 
have private but correlated information. In our model the correlation is perfect. For models without specific investment, 
it is well known that with risk neutral agents the principal can achieve first-best allocations by designing a suitable 
revelation mechanism. We find that a transfer pricing mechanism that entails no reporting to H.Q. can also achieve first- 
best investments and allocations. 
Here, the symbol $E_\theta[\cdot]$ represents the expected value of a function with respect to the probability
distribution $F(\cdot)$.8 Each division’s investment is restricted to the interval $[0, I_{\max}]$. We shall use the
following assumptions:


(A1) $\frac{\partial^2}{\partial q\,\partial I_2} R(q, \theta, I_2) > 0$ and $\frac{\partial^2}{\partial q\,\partial I_1} C(q, \theta, I_1) < 0$ for all $q$, $\theta$, $I_1$, and $I_2$.


(A2) The profit function $\Gamma(\cdot)$ has a unique maximum at $I^* \equiv (I_1^*, I_2^*)$, with $I_i^*$ in the interior of
$(0, I_{\max})$.


Assumption (A1) says that the specific investments increase the marginal revenue associated
with any given $q$ and decrease the marginal cost. For instance, we may think of $I_1$ as an expenditure
for equipment that reduces the direct labor cost of each unit of the intermediate product. A direct
consequence of (A1) is that specific investments increase the optimal transfer quantity $q^*$
(provided $q_{\max} > q^* > 0$). Assumption (A2), in conjunction with the Envelope Theorem, implies
that the optimal investments satisfy the following conditions:


$$E_\theta\left[\frac{\partial}{\partial I_1} C(q^*(\theta, I^*), \theta, I_1^*)\right] = -1 \tag{2}$$


and


$$E_\theta\left[\frac{\partial}{\partial I_2} R(q^*(\theta, I^*), \theta, I_2^*)\right] = 1. \tag{3}$$


To derive some of our results it will be convenient to impose the following technical 
condition: 


(A2') Assumption (A2) holds and $q^*(\theta, I_1, I_2)$ is in the interior of $[0, q_{\max}]$ for all $\theta$, $I_1$, and $I_2$.


Central management is assumed to observe only the divisional income figures, denoted by
$\Pi_1$ and $\Pi_2$. The incremental contributions to divisional income, $\Delta\Pi_i$, resulting from investments
and intracompany transfers are given by: $\Delta\Pi_2 = R(q, \theta, I_2) - t - I_2$ and $\Delta\Pi_1 = t - C(q, \theta, I_1) - I_1$.
Thus specific investment becomes an expense for each division, but imposes no personal costs
on the manager. To the extent that divisional managers seek to increase their own division’s
income, though, they may have a natural tendency to underinvest.


III. INVESTMENT UNDER NEGOTIATED TRANSFER PRICING 


To assess the effectiveness of negotiated transfer pricing, one needs to specify how the 
division managers bargain with each other. We suppose that the parties agree on an efficient 
transfer (from the perspective of the bargainers) and a corresponding transfer payment that splits 
the available surplus. At date 1, each division manager expects the current agreement to be 
renegotiated at date 4, when all cost and revenue uncertainty has been resolved. 

We postulate that either division could insist on fulfilling the date 1 agreement, and therefore 
this agreement defines the status quo (or disagreement point) in the negotiation at date 4. For 


8 Formally, $E_\theta[\cdot] \equiv \int [\cdot]\, dF(\theta)$.
instance, the parties may instruct the bookkeeping office to record $(\bar q, \bar t)$ at date 4 unless both sides
agree to a different transaction $(\hat q, \hat t)$. This instruction produces a rule similar to the specific
performance remedy in contract law.9 Such a rule allows neither party to breach a contract
unilaterally; it stands in contrast to a damage rule which would allow either party to breach the
date 1 contract and pay the other party damages.

Expectation damages is a more common remedy than specific performance in commercial 
disputes. Edlin and Reichelstein (1994) show that the expectation damages remedy is not well 
suited to solve bilateral investment problems. Our results therefore suggest that a vertically 
integrated firm, which adopts the specific performance rule, may have an advantage over a system
of inter-firm transactions, which is accompanied by an expectation damages remedy. We are not 
aware of any large sample surveys documenting what breach remedies are prevalent in 
multidivisional firms. Based on their field studies, though, Meurer (1993) and Shelanski (1993) 
provide some empirical support for the use of the specific performance rule. 

In the following analysis, we refer to the “$\gamma$-surplus sharing rule” as that giving Division 1 the
$\gamma$-share ($0 \le \gamma \le 1$) of the available surplus and Division 2 the remaining $(1 - \gamma)$-share. If the
divisions agree to the prior contract $(\bar q, \bar t)$ at date 1, and subsequently invest $I_1$ and $I_2$, respectively,
the renegotiation surplus available in state $\theta$ is given by:


$$M(\theta, I) - M(\bar q, \theta, I). \tag{4}$$


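Under $\gamma$-surplus sharing, the parties will agree to the contract $(\hat q, \hat t)$ at date 4, where $\hat q$ is the
efficient intracompany transfer, i.e., $\hat q = q^*(\theta, I)$, and the transfer payment satisfies:


$$R(q^*(\theta, I), \theta, I_2) - \hat t = R(\bar q, \theta, I_2) - \bar t + (1 - \gamma) \cdot [M(\theta, I) - M(\bar q, \theta, I)] \tag{5}$$


$$\hat t - C(q^*(\theta, I), \theta, I_1) = \bar t - C(\bar q, \theta, I_1) + \gamma \cdot [M(\theta, I) - M(\bar q, \theta, I)]. \tag{6}$$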


At date 1, the status quo point is $(q, t) = (0, 0)$ since either division may refuse trade. The date
1 transfer payment $\bar t$ can also be derived from the rules of surplus sharing. However, since this
transfer payment does not affect the divisions' subsequent incentives, we will not calculate it
explicitly. We note that when $\gamma = \frac{1}{2}$, the bargaining solution in (5) corresponds to the Nash
bargaining solution. In the derivation of our results we first take the parameter $\gamma$ as fixed and
assume that $0 < \gamma < 1$. Variations in $\gamma$ and the extreme cases of $\gamma = 0$ or $\gamma = 1$ will be discussed below.

Following the agreement $(\bar q, \bar t)$ at date 1, Division 1 will choose its investment so as to
maximize the expected contribution to its divisional income. Formally, Division 1 maximizes:


$$\Gamma_1(I_1, I_2, \bar q) \equiv E_\theta\big[\bar t - C(\bar q, \theta, I_1) + \gamma \cdot [M(\theta, I) - M(\bar q, \theta, I)]\big] - I_1 \tag{7}$$


with respect to $I_1$, given its conjecture about the investment $I_2$ made by the other division. We
denote by $I_1(\bar q, I_2)$ the maximizers of (7). (It is possible that $I_1(\bar q, I_2)$ is a set since there may be
multiple maximizers.) Analogous to (7), we define $\Gamma_2(I_1, I_2, \bar q)$ based on the right-hand side of
equation (5). The optimal specific investments for Division 2 are denoted by $I_2(\bar q, I_1)$.


9 Unlike our specific performance analysis of interfirm trade (see Edlin and Reichelstein (1994)), in the current model
the bookkeeping office serves the purpose of ensuring that the date 1 agreement is a credible threat. Alternatively, the 
parties could sign an agreement amongst themselves knowing that either side could subsequently ask H.Q. to uphold 
the agreement. The date 1 contract can always be structured so that one division would prefer to fulfill this contract rather 
than not trade at all. 


Edlin and Reichelstein—Specific Investment Under Negotiated Transfer Pricing 281 


Our first result examines how each division’s optimal investment responds to changes in the
transfer quantity agreed to at date 1 and to changes in the other division’s investment. We say
that $I_1(\bar q, I_2)$ is strictly increasing in $\bar q$ and $I_2$ if, for any $\bar q$, $I_2$, $\Delta \bar q \ge 0$ and $\Delta I_2 \ge 0$, the following
is true: if $I_1 \in I_1(\bar q, I_2)$, with $I_{\max} > I_1 > 0$, and $I_1' \in I_1(\bar q + \Delta \bar q, I_2 + \Delta I_2)$, then $I_1' > I_1$ (unless, of
course, $\Delta \bar q = 0 = \Delta I_2$).


Proposition 1: Given assumptions (A1) and (A2'), $I_1(\bar q, I_2)$ is strictly increasing in $\bar q$ and
$I_2$. Similarly, $I_2(\bar q, I_1)$ is strictly increasing in $\bar q$ and $I_1$.


The proof of Proposition 1 is given in the appendix. To provide the intuition behind the result,
we note that equation (7) implies that for given $\bar q$ and $I_2$ the first-order condition for Division 1's
optimal investment choice is:


$$-(1 - \gamma) \cdot E_\theta\left[\frac{\partial}{\partial I_1} C(\bar q, \theta, I_1)\right] - \gamma \cdot E_\theta\left[\frac{\partial}{\partial I_1} C(q^*(\theta, I), \theta, I_1)\right] = 1. \tag{8}$$


Comparison of equations (2) and (8) shows that if $\bar q = 0$ (and $\frac{\partial}{\partial I_1} C(0, \theta, I_1) = 0$), Division 1
will tend to underinvest when $I_2 = I_2^*$. This is the hold-up situation discussed by Williamson
(1985) and others. Because Division 1 receives only a share of the firm’s marginal return to
investment, it is unwilling to invest the efficient amount. When $\bar q$ is positive, however, Division
1 receives an indirect benefit from its specific investment as well. Higher investment allows
Division 1 to produce the quantity $\bar q$ at lower cost. Though this quantity will generally not be
delivered in equilibrium, the potential cost reduction raises Division 1's status quo payoff at date
4, and, as shown in equation (8), Division 1 will receive an additional return for its investment.
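
The step from Division 1's objective to its first-order condition can be sketched as follows. This is a reading aid reconstructed from the definitions in the text, assuming differentiability and an interior efficient quantity; here $\bar q$ denotes the date 1 quantity, $\gamma$ is Division 1's surplus share, and $q^*(\theta, I)$ is the efficient transfer. It is not a substitute for the proof in the appendix.

```latex
\begin{align*}
\Gamma_1(I_1, I_2, \bar q)
  &= E_\theta\big[\bar t - C(\bar q,\theta,I_1)
     + \gamma\,[M(\theta,I) - M(\bar q,\theta,I)]\big] - I_1, \\
\frac{\partial}{\partial I_1} M(\theta, I)
  &= -\frac{\partial}{\partial I_1} C(q^*(\theta,I),\theta,I_1)
  && \text{(Envelope Theorem)}, \\
\frac{\partial}{\partial I_1} M(\bar q,\theta, I)
  &= -\frac{\partial}{\partial I_1} C(\bar q,\theta,I_1), \\
\frac{\partial \Gamma_1}{\partial I_1}
  &= -(1-\gamma)\,E_\theta\Big[\frac{\partial}{\partial I_1} C(\bar q,\theta,I_1)\Big]
     - \gamma\,E_\theta\Big[\frac{\partial}{\partial I_1} C(q^*(\theta,I),\theta,I_1)\Big] - 1 .
\end{align*}
```

Setting the last expression to zero gives the first-order condition. The first term is the return earned through the status quo payoff, and the second is Division 1's share of the marginal gain from efficient trade.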

Figure 2 illustrates Proposition 1 for two alternative values of $\bar q$. For simplicity, we assume
in this illustration that each division has a unique best response, i.e., the sets $I_i(\cdot)$ are single valued.
The two intersection points, A and B, in figure 2 identify investment levels that form a Nash
equilibrium at date 2, given the respective prior agreement $\bar q_A$ or $\bar q_B$. Figure 2 suggests that if for
some reason Division 2 were constrained to choose $I_2 = I_2^*$ (possibly because the investment $I_2$
is observable), then there would exist some transfer quantity $\bar q$, with $\bar q_A < \bar q < \bar q_B$, which would
provide Division 1 with the desired incentives. In particular, Division 1’s reaction curve $I_1(\bar q, \cdot)$
would pass through the point $(I_1^*, I_2^*)$.10

For our two-sided investment problem, it appears generally impossible that a single quantity
$\bar q$ could induce both divisions to undertake the efficient investments $I_1^*$ and $I_2^*$, respectively. An
important special case of our model, however, occurs when the revenue and cost functions satisfy
the following separability assumption:


(A3) $R(q, \theta, I_2) = R_1(I_2) \cdot q + R_2(q, \theta) + R_3(\theta, I_2)$
$C(q, \theta, I_1) = C_1(I_1) \cdot q + C_2(q, \theta) + C_3(\theta, I_1)$.


The economic interpretation of this separability assumption is straightforward: a dollar of
specific investment by Division 1 reduces the unit variable cost of the intermediate product by
$|C_1'(I_1)|$, independent of the actual state $\theta$. This feature is consistent with the interpretation given

10 Proposition 1 in Edlin and Reichelstein (1994) shows that under certain regularity conditions a quantity $\bar q$ exists which
solves a one-sided investment problem.
before, where specific investment takes the form of production equipment that lowers direct labor
costs for each unit of output. As before, we shall invoke assumption (A1), so that $R_1'(I_2) > 0$ and
$C_1'(I_1) < 0$.


Proposition 2: Given (A1) - (A3), if the parties agree to the prior quantity:


$$\bar q = E_\theta[q^*(\theta, I_1^*, I_2^*)] \tag{9}$$

at date 1, then the efficient investment levels $I_1^*$ and $I_2^*$ form a Nash
equilibrium at date 2.


Proof: We demonstrate only that $I_1^*$ is an optimal choice for the supplying division given the
conjecture that $I_2 = I_2^*$. One can make a symmetric argument for the buying division. Suppose
Division 1 conjectures that Division 2 will invest the efficient amount $I_2^*$. As argued above,
Division 1’s best response satisfies the first-order condition given by (8). If $\bar q$ is chosen according
to (9) and the cost function $C(q, \theta, I_1)$ satisfies the separability assumption (A3), then:


$$E_\theta\left[\frac{\partial}{\partial I_1} C(\bar q, \theta, I_1)\right] = \bar q \cdot C_1'(I_1) + E_\theta\left[\frac{\partial}{\partial I_1} C_3(\theta, I_1)\right] = E_\theta\left[\frac{\partial}{\partial I_1} C(q^*(\theta, I^*), \theta, I_1)\right]$$


for all $I_1$. Recalling equation (2), we conclude that $\frac{\partial}{\partial I_1} \Gamma_1(I_1, I_2^*, \bar q) = 0$ at $I_1 = I_1^*$.
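It remains to show that $I_1^*$ is indeed a global maximizer of $\Gamma_1(\cdot, I_2^*, \bar q)$. We note that when
$\bar q = E_\theta[q^*(\theta, I_1^*, I_2^*)]$,


$$\frac{\partial}{\partial I_1} \Gamma_1(I_1, I_2^*, \bar q) \ge \frac{\partial}{\partial I_1} \Gamma(I_1, I_2^*)$$


for $I_1 < I_1^*$, while the opposite inequality holds for $I_1 > I_1^*$. Both inequalities follow immediately
from the fact that $E_\theta[q^*(\theta, I_1, I_2^*)]$ is (weakly) increasing in $I_1$. Therefore,


$$\Gamma_1(I_1^*, I_2^*, \bar q) - \Gamma_1(I_1, I_2^*, \bar q) = \int_{I_1}^{I_1^*} \frac{\partial}{\partial I_1} \Gamma_1(t, I_2^*, \bar q)\, dt \ge \int_{I_1}^{I_1^*} \frac{\partial}{\partial I_1} \Gamma(t, I_2^*)\, dt = \Gamma(I_1^*, I_2^*) - \Gamma(I_1, I_2^*) \ge 0,$$


which shows that $I_1^*$ is indeed a best response for Division 1. That completes the proof of
Proposition 2.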

The additive separability assumption (A3) ensures that a single instrument can satisfy the 
incentive compatibility constraints of both divisions simultaneously. We may interpret the 
revenue and cost functions in (A3) as second order approximations of the “true” valuation 
functions. That is, if one considers a second order Taylor expansion of $C(\cdot)$ around some point $(q^0,
\theta^0, I_1^0)$ then the resulting second order polynomials always satisfy (A3). Consistent with the
intuition developed in figure 2, we conclude that a system of negotiated transfer pricing, which 
allows for renegotiation, will provide at least an approximate solution to the bilateral investment 
problem. 

The message of Proposition 2 is that renegotiation of simple fixed-price contracts can induce
efficient investment. Moreover, the desired prior quantity $\bar{q}$ is invariant to changes in $\gamma$.
Intuitively, one might think that Division 1's willingness to invest decreases as $\gamma$ decreases.
However, consider equation (8). As $\gamma$ decreases, Division 1's share of the ultimate gains from
trade, i.e., $M(\theta, I)$, decreases. But at the same time, Division 1 retains a larger share of the reduction
in cost to produce $\bar{q}$. The corresponding term is $(1 - \gamma) \cdot C(\bar{q}, \theta, I_1)$. If $\bar{q}$ is chosen according to
(9), the two effects cancel each other precisely.
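
The cancellation can be checked numerically. The following Python sketch is an illustration only. It assumes a hypothetical quadratic example that is not taken from the paper: only Division 1 invests, revenue is $R(q, \theta) = (a + \theta) q - q^2/2$, cost is $C(q, \theta, I_1) = (c - \sqrt{I_1}) q$, and $\theta$ takes the values -1 and +1 with equal probability. Division 1's payoff is written as its status quo payoff plus a share $\gamma$ of the renegotiation surplus, with terms that do not depend on $I_1$ dropped. All function names and parameter values are assumptions made for the illustration.

```python
import math

# Hypothetical parameters (not from the paper): a - c = 2, theta in {-1, +1}.
A_MINUS_C = 2.0
THETAS = (-1.0, 1.0)


def expect(f):
    """Expectation over the equally likely states in THETAS."""
    return sum(f(theta) for theta in THETAS) / len(THETAS)


def unit_saving(i1):
    """Reduction in unit cost bought by investment i1 (assumed sqrt form)."""
    return math.sqrt(i1)


def q_star(theta, i1):
    """Ex-post efficient quantity: marginal revenue equals marginal cost."""
    return A_MINUS_C + theta + unit_saving(i1)


def max_surplus(theta, i1):
    """M(theta, I): joint surplus R - C evaluated at q_star."""
    return 0.5 * q_star(theta, i1) ** 2


def firm_profit(i1):
    """Expected joint surplus net of the investment outlay."""
    return expect(lambda th: max_surplus(th, i1)) - i1


def division1_payoff(i1, gamma, q_bar):
    """I1-dependent part of Division 1's expected payoff.

    gamma * E[M] is its share of the gains from trade; (1 - gamma) times the
    cost saving on the prior quantity q_bar is what it keeps at the status quo.
    """
    gains = gamma * expect(lambda th: max_surplus(th, i1))
    status_quo_saving = (1.0 - gamma) * unit_saving(i1) * q_bar
    return gains + status_quo_saving - i1


def argmax_concave(f, lo=0.0, hi=25.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


i1_star = argmax_concave(firm_profit)            # efficient investment
q_bar = expect(lambda th: q_star(th, i1_star))   # prior quantity as in (9)
best_responses = {
    g: argmax_concave(lambda i1: division1_payoff(i1, g, q_bar))
    for g in (0.1, 0.5, 0.9)
}
```

In this example Division 1's best response coincides with the efficient investment for every value of $\gamma$ tried, whereas a prior quantity below `q_bar` yields a lower investment.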

It is instructive to compare our findings with Chung (1991). In his model, the parties also sign
a fixed-price contract at date 1. In addition, they contractually agree on a particular renegotiation
process at date 4, in which one side makes a take-it-or-leave-it offer. This corresponds to a
situation where $\gamma = 0$ or $\gamma = 1$. Chung finds that even without the separability assumption (A3)
it is then possible to choose a prior quantity $\bar{q}$ so that both parties will undertake efficient
investments. For instance, when $\gamma = 0$, Division 2 will obviously have the desired incentive since
it receives the entire renegotiation surplus and, therefore, the full return from its specific
investment. In contrast, Division 1's incentives are not affected by the expected renegotiation
outcome. It anticipates that it will be held to its status quo payoff, and therefore it chooses $I_1$ so
as to maximize $-E_\theta[C(\bar{q}, \theta, I_1)] - I_1$. Chung (1991) shows that under “mild” conditions there
exists a $\bar{q}$ such that $-E_\theta[C(\bar{q}, \theta, I_1)] - I_1$ is maximized at $I_1^*$.

Though Chung's analysis establishes an exact solution to the bilateral investment problem 
without assumption (A3), we have concerns about the implementability of his solution. The 
equilibrium incentives of Chung's mechanism rely crucially on the assumption that if Division 
1 rejects Division 2's take-it-or-leave-it offer at date 4, there will be no further negotiation. To 
make such a policy credible, H.Q. would have to monitor the process and prevent the divisions 
from trading the intermediate product in the foreseeable future, once Division 1 has rejected the 
other division's offer. Such interference appears costly and impractical in most environments. 

Propositions 1 and 2 above have relied on the fact that the quantity choice $q$ is continuous.
Rogerson (1984) considers a situation where the separability assumption (A3) is satisfied and an
indivisible good can either be traded or not, i.e., $q \in \{0,1\}$. Rogerson finds that a prior agreement
of $\bar{q} = 0$ will induce underinvestment while $\bar{q} = 1$ leads to overinvestment. This conclusion is
consistent with Proposition 2 since $\bar{q} = 1$ exceeds $E_\theta[q^*(\theta, I^*)]$. To obtain efficient investments
with a binary quantity set, the prior agreement at date 1 has to involve some randomization.¹¹

The foregoing analysis has simply assumed that the parties adopt the $\gamma$-surplus sharing rule
in their negotiations. Bargaining theory has shown that the same outcomes can also be obtained 
as non-cooperative equilibria of a game in which the players alternate in making offers. Suppose 
that at date 4 the manager of Division 2 first proposes an outcome (q, t). Division 1 can accept or 
reject the offer. If it rejects the offer, Division 1 proposes an outcome in the second round, which 
Division 2 can accept or reject. The game proceeds in this manner; Division 2 makes offers in odd- 
and Division 1 in even-numbered rounds. 

To impose discipline on the negotiating process, suppose there is an arbitrarily small but
positive probability, $\varepsilon$, that the game terminates whenever an offer is rejected. This chance of
breakdown may reflect that a manager may be irritated by the other's refusal. Alternatively,
managers may have to attend to other matters, and therefore cannot continue in the bargaining
process. If the game terminates without agreement at any stage, the parties abide by the status quo
outcome.¹² Myerson (1991, Theorem 8.3) shows that this alternating offers game has a unique
subgame-perfect equilibrium, in which Division 2's first offer is accepted by Division 1. The
equilibrium transfer payment varies with $\varepsilon$; as $\varepsilon$ approaches zero the equilibrium transfer payment
approaches the value that corresponds to $\gamma = .5$.
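
The limiting split can be illustrated with a stylized version of this game. The Python sketch below rests on simplifying assumptions that are not part of the paper's statement of Myerson's theorem: both managers are risk neutral, the renegotiation surplus is normalized to one, and after each rejection the game breaks down with probability `eps`, leaving no surplus to share. Under these assumptions the proposer's stationary share x solves x = 1 - (1 - eps) x. The function names are hypothetical.

```python
def proposer_share(eps):
    """Stationary proposer share solving x = 1 - (1 - eps) * x."""
    return 1.0 / (2.0 - eps)


def division1_share(eps):
    """Share left to Division 1, the responder to Division 2's first offer."""
    return 1.0 - proposer_share(eps)


def proposer_share_by_induction(eps, rounds=2000):
    """Backward induction from a last round in which the proposer takes all.

    In each earlier round the proposer offers the responder exactly its
    continuation value, (1 - eps) times next round's proposer share.
    """
    share = 1.0
    for _ in range(rounds):
        share = 1.0 - (1.0 - eps) * share
    return share
```

For any positive `eps` the first mover keeps slightly more than half of the surplus, and `division1_share(eps)` rises toward one half as `eps` shrinks.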

Myerson's (1991) result suggests that one can obtain a purely non-cooperative version of
Proposition 2. Given assumptions (A1)-(A3), suppose that division managers play the alternating
offers game at dates 1 and 4. The resulting game has a subgame perfect equilibrium with the
following properties: in the first round of the date 1 bargaining process the parties agree to a pair
$(\bar{q}, \bar{t})$ with $\bar{q} = E_\theta[q^*(\theta, I_1^*, I_2^*)]$. Subsequently, they choose the efficient investments $I_1^*$ and
$I_2^*$, respectively. Finally, the first offer of the date 4 bargaining process is accepted leading to the
transfer $q^*(\theta, I_1^*, I_2^*)$.¹³


IV. MORAL HAZARD AND DIVISIONAL PERFORMANCE EVALUATION 


In the model presented thus far, divisional managers take no actions that impose personal 
costs on them. Nonetheless, it was assumed that each divisional manager maximizes his/her 
division's expected income. While the latter specification seems descriptive, one may object that 
in the current model it would have been easier to give divisional managers a share of total firm 
profits. H.Q. could then simply instruct managers to make investment and quantity transfer 
decisions in the firm's overall best interest, instead of the divisional interests. 

In this section, we expand the model to include moral hazard on the part of divisional 
managers. As described in section II, the two divisions are assumed to operate in separate markets
except for the intermediate product in question. Let $x_i$ denote operating income for Division i
resulting from "external operations," i.e., from transactions that do not involve the other division.


11 For instance, the parties could sign a contract which stipulates that with probability $p$ there will be a transfer ($\bar{q} = 1$)
at date 1, while with probability $(1 - p)$ there will be no transfer. Given (A3) and $p = E_\theta[q^*(\theta, I^*)]$, Proposition 2 continues
to hold.

12 Rather than appeal to a small exogenous probability of negotiation breakdown, one may alternatively postulate that the 
parties discount agreements reached in later rounds. This model has been studied by Rubinstein (1982); see also 
Myerson (1991). 

13 See Edlin and Reichelstein (1994) for a formal analysis of this model.

Suppose the manager of Division i can increase $x_i$ by taking actions that are personally costly. For
the purpose of performance evaluation, H.Q. can observe only the aggregate operating results:


$$z_2 = x_2 + R(q, \theta, I_2) - I_2$$
$$z_1 = x_1 - C(q, \theta, I_1) - I_1 \qquad (10)$$


This situation reflects “account fungibility;" H.Q. cannot identify whether a dollar in cost was 
incurred to support the intermediate product (transferred to Division 2) or to support another 
externally sold product. Given the transfer payment t, the divisional income figures become: 


$$\Pi_1 = z_1 + t \quad \text{and} \quad \Pi_2 = z_2 - t. \qquad (11)$$


At the beginning of the period (and prior to contracting) each division manager receives private
information, represented by a one-dimensional variable, $\tau_i$. The realized value of $x_i$ is assumed to
depend on $\tau_i$ and on the manager’s unobservable actions. Let $D_i(x_i, \tau_i)$ denote the personal cost
manager i bears if he/she wants to attain external operating income $x_i$ in state $\tau_i$. As usual, this cost
represents the disutility of managerial effort or the utility foregone when discretionary expenses
are reduced.

When divisional managers also take actions that are personally costly, H.Q. faces a combined 
managerial incentive and intrafirm resource allocation problem. Our main finding is that this 
problem can effectively be decomposed: H.Q. selects suitable compensation schemes based on 
divisional income for each of the managers and adopts a policy of negotiated transfer pricing. 
Earlier literature has shown that under certain conditions linear incentive schemes are optimal for 
agency problems with asymmetric information and risk neutral agents. We note that the results 
of section III above apply without change if at date 0 the manager of Division i were given a
compensation scheme of the form $\alpha_i \cdot \Pi_i + \beta_i$, where $0 < \alpha_i < 1$ represents a bonus parameter.

If the manager of Division i seeks to maximize the expected value of $\alpha_i \cdot \Pi_i$, the outcome of
$\gamma$-surplus sharing is unaffected by the value of $\alpha_i$. In fact, for the Nash bargaining solution this
is true by definition, since one of the axioms underlying the Nash solution is that the bargaining
outcome is unchanged if either party's utility is scaled by a constant factor.¹⁴ If one takes a
non-cooperative approach to bargaining and considers, for instance, the alternating offers game
described in section III, it is also obvious that any equilibrium outcome for given $(\alpha_1, \alpha_2)$ will
remain an equilibrium if the $\alpha_i$'s change.
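
The scale invariance of the Nash solution can be seen in a small numerical sketch. The Python example below is illustrative and makes several assumptions beyond the text: the surplus to be divided is measured in divisional income and normalized to one, disagreement payoffs are zero, and manager i values a dollar of divisional income at the hypothetical rate `alpha_i`. The function name `nash_split` is also hypothetical.

```python
def nash_split(alpha1, alpha2, surplus=1.0, steps=1000):
    """Grid search for the income split that maximizes the Nash product.

    Division 1 receives s of the surplus in divisional income and Division 2
    receives the rest; each manager's utility is alpha_i times that income.
    """
    grid = [surplus * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda s: (alpha1 * s) * (alpha2 * (surplus - s)))
```

Because the bonus parameters multiply the Nash product by the constant `alpha1 * alpha2`, the maximizing split of divisional income is the same for any positive pair of bonus parameters.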

To describe the sequence of events, suppose the timeline of figure 1 is extended to the left,
as shown in figure 3. At date -2, the division managers receive their private information $\tau_i$. It is
public knowledge that $\tau_i \in [\underline{\tau}_i, \bar{\tau}_i]$, and that the prior probability distribution of $\tau_i$ is $F_i(\tau_i)$, with
density $f_i(\tau_i)$. At date -1, H.Q. proposes a contract to the division managers, and at date 0, the
managers report their private information to H.Q.

Suppose that at date -1 H.Q. offers each manager a menu of compensation functions, each
linear in divisional income. Thus, H.Q. commits to three functions $\{\beta_i(\tau_i), \alpha_i(\tau_i), \bar{\Pi}_i(\tau_i)\}$
specifying a fixed salary $\beta_i$, a bonus parameter $\alpha_i$, and a target level for divisional income $\bar{\Pi}_i$. If
the manager of Division i reports $\tau_i$, his/her compensation scheme becomes:


14 Intuitively, it may appear that the renegotiation surplus increases as the $\alpha_i$'s increase. This would be true if the divisional
managers could make direct side payments to each other. In our model, however, they can only transfer divisional
income, which translates into personal income at the rate $\alpha_i$.

FIGURE 3
Timeline extended to the left: date -2, $(\tau_1, \tau_2)$; date -1, contracts; date 0, $(\tau_1, \tau_2)$.

$$\beta_i(\tau_i) + \alpha_i(\tau_i)\,[\Pi_i - \bar{\Pi}_i(\tau_i)]. \qquad (12)$$


The bonus parameters $\alpha_i(\cdot)$ are chosen so as to provide divisional managers with appropriate
performance incentives with respect to $x_i$. The target profits $\bar{\Pi}_i(\tau_i)$ are equal to the expected value
of $\Pi_i$ so that the expected “budget variance” in (12) is zero. As a consequence, $\beta_i(\tau_i)$ becomes the
manager’s expected compensation. Its magnitude is given by some exogenous market constraint,
which we normalize to zero, without loss of generality. Because of their private information,
managers will earn informational rents. Given the assumptions stated below, these rents will be
decreasing in $\tau_i$. Thus, $\beta_i(\tau_i)$ is decreasing with $\beta_i(\bar{\tau}_i) = 0$. The exact functional forms for the triplet
$\{\beta_i(\cdot), \alpha_i(\cdot), \bar{\Pi}_i(\cdot)\}$ are given in the proof of Proposition 3 (appendix). We refer to this triplet as
a proper menu of linear compensation schemes.

Following the approach of Melumad et al. (1992), and Vaysman (1994a), we invoke the 
following assumptions to ensure the optimality of a menu of linear incentive schemes. 


(A4) $D_i(x_i, \tau_i) = b_i(\tau_i) \cdot d_i(x_i)$. Both $b_i(\cdot)$ and $d_i(\cdot)$ are positive, differentiable, increasing and convex.

(A5) The ratio $\frac{F_i(\tau_i)\, b_i'(\tau_i)}{f_i(\tau_i)\, b_i(\tau_i)}$ is increasing in $\tau_i$.


The multiplicative separability assumption in (A4) implies that higher types $\tau_i$ face a
uniformly higher cost of achieving external operating income $x_i$. The convexity conditions ensure
in particular that the cost functions satisfy the familiar single-crossing property. Assumption (A5)
modifies the usual monotone inverse hazard rate condition, which requires that $F_i(\tau_i)/f_i(\tau_i)$ be
increasing in $\tau_i$.

The firm's net-profit is given by the sum of the divisional operating "cash flows" minus the
compensation paid to divisional managers. To find an optimal mechanism, H.Q. could design a
(Bayesian) incentive compatible revelation mechanism. Such a mechanism would be rather
complex since divisional managers would be asked to report information at date 0 regarding $\tau_i$ and
at date 3 regarding $\theta$. H.Q. would have to specify policies for investment and intrafirm transfers
based on the reports obtained. The resulting revelation mechanism would obviously involve more
instructions and more reporting to H.Q. than the decentralized mechanism described above.


Proposition 3: Given assumptions (A1)-(A5), a proper menu of linear compensation schemes 
for each divisional manager combined with a policy of negotiated transfer 
pricing maximizes the firm's net profit among all incentive compatible 
mechanisms. 





15 Laffont and Tirole (1986), Kanodia (1993) and others have also analyzed settings in which menus of linear contracts
are optimal. In their analysis, the cost functions $D_i(x_i, \tau_i)$ are additively separable in $x_i$ and $\tau_i$ rather than multiplicatively
separable as in (A4).

In the proof of Proposition 3 (see appendix), we consider a benchmark problem which would
result if H.Q. could actually observe the specific investments and the state $\theta$ at date 3. We
demonstrate that a proper menu of linear compensation schemes combined with negotiated
transfer pricing achieves the same expected net-profit as the optimal revelation mechanism for
the benchmark problem. The intuition is straightforward in light of Proposition 2 above.
Independent of the bonus parameters determined at date 0, the division managers will, in
equilibrium, choose first-best investments and intracompany transfers. At date 0, the contributions
to divisional income from internal transactions, i.e., $R(q, \theta, I_2) - I_2 - t$ and $t - C(q, \theta, I_1) - I_1$,
are effectively viewed as noise terms. All parties have identical knowledge about these two
random variables, including their expected values. Because of risk neutrality and the linearity of
the incentive schemes, the presence of a noise term has no effect on managers' behavior.
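
The irrelevance of the noise term can be illustrated with a short simulation. The Python sketch below is a toy example under stated assumptions, none of which come from the paper: the effort cost is quadratic, $d(x) = x^2/2$, the internal-trade contribution to divisional income is drawn from a normal distribution with arbitrary mean and spread, and the constant terms of the linear scheme (fixed salary and target) are omitted because they do not depend on the manager's choice. All names and numbers are hypothetical.

```python
import random


def expected_payoff(x, alpha, b, noise_draws):
    """Average of alpha * (x + noise) - b * d(x), with assumed d(x) = x**2 / 2."""
    effort_cost = b * 0.5 * x * x
    return sum(alpha * (x + n) - effort_cost for n in noise_draws) / len(noise_draws)


def best_x(alpha, b, noise_draws, grid):
    """Grid point that maximizes the manager's expected payoff."""
    return max(grid, key=lambda x: expected_payoff(x, alpha, b, noise_draws))


rng = random.Random(0)                        # fixed seed for reproducibility
grid = [k / 100 for k in range(301)]          # candidate values of x
noisy = [rng.gauss(5.0, 3.0) for _ in range(1000)]
x_without_noise = best_x(0.6, 0.5, [0.0], grid)
x_with_noise = best_x(0.6, 0.5, noisy, grid)
```

Since the noise enters the linear scheme additively, it shifts the risk-neutral manager's expected payoff by a constant and leaves the maximizing choice of external operating income unchanged.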

For the mechanism we have described, each divisional manager has a "strong" incentive to 
report truthfully at date 0 since the bonus parameter he will receive is independent of the other 
manager's report. Truthful reporting is not quite a dominant strategy, however, since the manager 
of Division j could tie his specific investment decision $I_j$ at date 2 to the value of $\alpha_i$ resulting from
i's reporting at date 0. Of course, the manager of Division j has no incentive to adopt such a 
strategy since the outcome of the transfer pricing negotiation is independent of the bonus 
parameters. 

It follows from Proposition 3 that for the purpose of performance evaluation it is sufficient
to look at aggregate divisional income. Although H.Q. could disaggregate $\Pi_i$ into operating
results and transfer payment (i.e., into $z_i$ and $t$), Proposition 3 shows that divisional managers will
have the desired incentives if $z_i$ and $t$ receive the same weight in their incentive schemes.


V. CONCLUDING REMARKS 


This paper has analyzed a system of negotiated transfer pricing in which divisions can agree
on a simple fixed-price contract and renegotiate this contract on arrival of better information. Such
arrangements are sufficient to provide each division manager with the incentive to make upfront 
investments that are in the entire firm's interest. Moreover, when the divisional managers are 
subject to moral hazard, it is possible to solve the resulting incentive and resource allocation 
problem in a decentralized manner: managers are evaluated on the basis of divisional income and 
the divisions negotiate the transfer payment associated with interdivisional trade. 

Our analysis points to the importance of allowing divisional managers to renegotiate prior 
agreements upon the arrival of new information. Furthermore, it is essential that each manager 
perceives that he/she could insist that the prior contract be fulfilled. For select firms, Meurer 
(1993) and Shelanski (1993) indicate that central management imposes such a specific performance
rule when divisional managers seek to "get out" of intra-company agreements. Future
empirical research on transfer pricing could provide further evidence on the use of the specific 
performance rule. 

While our results support the common use of negotiated transfer pricing, it is also natural to 
ask why many firms prefer alternative rules of transfer pricing. First, our model neglects the cost 
of "haggling," which is frequently mentioned as a drawback of negotiated transfer pricing. The 
cost of bargaining may be particularly relevant when there are more than two divisions involved. 
For instance, the fixed cost incurred by the upstream division may benefit several downstream 
divisions. A negotiated system would then require some form of multilateral bargaining or a series 
of bilateral negotiations. 

We recall that the results of this paper hinge on the assumption that division managers have
symmetric information about revenues and costs in their bilateral negotiations. Bargaining theory
suggests that with incomplete information the equilibrium outcome will generally not be
efficient. An administered transfer pricing policy may partially overcome these inefficiencies.

Holmstrom and Tirole (1991) point out that it may be difficult to sign a prior contract because 
the good to be transferred cannot be specified at the outset. The usual hold-up argument then 
applies and the parties will underinvest. It is conceivable that some cost-based transfer pricing rule
may ameliorate the underinvestment problem. On the other hand, if the cost-based rule is 
administered by H.Q., and not subject to renegotiation, it may lead to quantity transfers that are 
ex-post inefficient. As a consequence, there may be a tradeoff between ex-ante incentives for 
investment and ex-post allocative efficiency. 


APPENDIX 


Proof of Proposition 1: Suppose /,€ Î, (4, I) and i, € Î, (g + Ag, 1,+Al,), where Ag >0 
and AI, 2 0. We establish the claim that 7, » 7, in two steps. We first consider any I; min ( f, 
(q "s I)} and ud that 7; >J, . The ad step then shows that I; i Í: 

By definition, (1, DZI; 1, 7) and (Ij, L, d +Ag) 20,1, g +Aq). Adding 
these inequalities, we AES 


DC lp | +44)- DQ, I, d c Aq) -ITCIr D, 4) - Is D, 4) 20. 


The left-hand side of (i) is equal to: 


E T xa — T (u, L, v)du dv. 
From equation (7) we obtain that 





e ol, gol, 
By assumption (A1), the right-hand side of (iii) is positive. Therefore the integrand in (i7) is 
positive, and since the value of the integral must be Aor it de that J; 2/,. It cannot 
be true that J; = J, since that would imply £l, (1, 0- dI ,I,g Ag). contradicting 
assumption (Al). Therefore I; >L, 
When AL-0,it follows from the definition of I? that 7? < l5 To show this inequality when 
AL»0,we dopi the same sequence of arguments, noting that 


—— Iu L.v)s-(I- »& d eb 


Aja, ——— R(q* (0, u, v), 0, v)- x *(0, wv > 0. 


Finally, if Ag =0 but AJ, > 0, we can establish along the same lines that I, > I, using again the 
fact that x», (uv, g)» 10. That completes the proof of Proposition 1. 


g? 
JLA, — — T, (u,v, g + Aq) = y- apo 


Proof of Proposition 3: Consider first the following benchmark problem: H.Q. observes $x_i$
directly, and $I$, $q$ and $\theta$ as well. To prove the claim, we establish in Step 1 the solution to the
benchmark problem. We then show in Step 2 that suitable linear incentive schemes combined with
negotiated transfer pricing lead to the same expected net-profit as in the benchmark problem.


Step 1: Under the benchmark problem, H.Q. offers the manager of Division i a contract of the form
$(x_i(\tau_i), H_i(\tau_i))$. If manager i reports $\tau_i$, he/she must deliver $x_i(\tau_i)$ and is paid $H_i(\tau_i)$. Suppose each
manager’s reservation utility is normalized to zero. It can then be shown with standard techniques
that any incentive compatible mechanism has to satisfy:


$$H_i(\tau_i) = D_i(x_i(\tau_i), \tau_i) + \int_{\tau_i}^{\bar{\tau}_i} \frac{\partial D_i}{\partial \tau_i}(x_i(y), y)\,dy. \qquad (i)$$

Furthermore:

$$E_{\tau_i}[H_i(\tau_i)] = E_{\tau_i}\left[\left(b_i(\tau_i) + \frac{F_i(\tau_i)}{f_i(\tau_i)}\, b_i'(\tau_i)\right) d_i(x_i(\tau_i))\right]. \qquad (ii)$$


Equation (ii) makes use of the multiplicative separability assumption (A4). The right-hand side
of (ii) is the expected “virtual” cost (true cost plus informational rent), which varies with the policy
$x_i(\cdot)$. The optimal policy $x_i^*(\cdot)$ is the one that maximizes (pointwise) the difference between profit
contribution and payment to the manager, i.e.,


$$x_i^*(\tau_i) \in \arg\max_{x_i} \left\{ x_i - \left(b_i(\tau_i) + \frac{F_i(\tau_i)}{f_i(\tau_i)}\, b_i'(\tau_i)\right) d_i(x_i) \right\}. \qquad (iii)$$


For notational brevity, let $E_{\tau_i}[H_i^*(\tau_i)]$ represent the right-hand side of (ii) evaluated at an optimal
policy $x_i^*(\cdot)$. It follows that the maximum net-profit attainable under the benchmark scenario is
given by:

$$\sum_{i=1}^{2} \left( E_{\tau_i}[x_i^*(\tau_i)] - E_{\tau_i}[H_i^*(\tau_i)] \right) + E_\theta[M(\theta, I^*)] - I_1^* - I_2^*. \qquad (iv)$$

Step 2: We construct a menu of linear incentive schemes based on divisional income and then
verify that the resulting net-profit achieves the upper bound given by (iv). Specifically, define the
functions $\alpha_i(\tau_i)$, $\beta_i(\tau_i)$ and $\bar{\Pi}_i(\tau_i)$ as follows:


$$\alpha_i(\tau_i) = b_i(\tau_i)\, d_i'(x_i^*(\tau_i)) \qquad (v)$$

$$\beta_i(\tau_i) = b_i(\tau_i)\, d_i(x_i^*(\tau_i)) + \int_{\tau_i}^{\bar{\tau}_i} b_i'(y)\, d_i(x_i^*(y))\,dy \qquad (vi)$$

$$\bar{\Pi}_2(\tau_2) = x_2^*(\tau_2) + E_\theta\left[R(q^*(\theta, I^*), \theta, I_2^*) - t_\gamma(\theta, I^*)\right] - I_2^* \qquad (vii)$$

$$\bar{\Pi}_1(\tau_1) = x_1^*(\tau_1) + E_\theta\left[t_\gamma(\theta, I^*) - C(q^*(\theta, I^*), \theta, I_1^*)\right] - I_1^*. \qquad (viii)$$


Here, $t_\gamma(\theta, I^*)$ denotes the final transfer price that results at date 4 under the $\gamma$-surplus sharing rule
if the state is $\theta$. Suppose each division manager reports $\tau_i$ truthfully at date 0. Subsequently, each
will choose $x_i$ so as to maximize $\alpha_i(\tau_i) \cdot x_i - b_i(\tau_i) \cdot d_i(x_i)$. Inspection of (v) shows that the optimal
choice is indeed $x_i^*(\tau_i)$. The expected compensation payment for manager i is given by:


$$E_\theta\left[\alpha_i(\tau_i) \cdot (\Pi_i - \bar{\Pi}_i(\tau_i))\right] + \beta_i(\tau_i). \qquad (ix)$$

By Proposition 2, the divisions will implement $I^*$ and $q^*(\theta, I^*)$ if they adopt the $\gamma$-surplus
sharing rule. Furthermore, the transfer price will be $t_\gamma(\theta, I^*)$ in state $\theta$. Hence the expected
contributions to divisional income from internal operations are the respective expressions on the
right-hand side of (vii) and (viii). It follows that $E_\theta[\alpha_i(\tau_i) \cdot (\Pi_i - \bar{\Pi}_i(\tau_i))] = 0$. By construction of
(vi), $\beta_i(\tau_i) = H_i^*(\tau_i)$; therefore, the expected compensation for each manager matches the
compensation paid in the benchmark problem.

To complete the proof, one needs to check that the menu of contracts given by (v)-(viii) is
indeed globally incentive compatible. By assumption (A5), the bonus parameters $\alpha_i(\cdot)$ are
decreasing in $\tau_i$, i.e., lower cost types receive larger bonus parameters. This property is necessary
for incentive compatibility. For the remaining details of the argument, see a similar proof in
Melumad et al. (1992).


ABSTRACT: Many states removed their bans on direct uninvited solicitation during
the 1980s, while others retained their restrictions. This period of contrast provides
an opportunity to examine and provide insight into associations that exist among
information dissemination, client-auditor alignment, and auditor independence.
Although concerns have been voiced that recent changes in competitive conditions
in the audit market are detrimental to audit quality, we present arguments and
evidence to the contrary. Results of a logistic regression analysis suggest that,
ceteris paribus, auditors in the market allowing solicitation are more likely than those
in the market banning solicitation to issue a nonstandard report.


I. INTRODUCTION


THE market for auditing services changed in recent years with the removal of restrictions
on direct uninvited solicitation in many states. Whether the recent loosening of the
standards that once prohibited accountants from cold phone calls and other hard-sell
marketing techniques has led to increased "shopping" for opinions with a concomitant effect on
auditor reporting decisions has been the subject of much debate.¹ In contrast to the claims of
opponents that direct solicitation adversely affects both independence and auditor performance,
we argue that conditions in markets allowing direct uninvited solicitation could improve the
average quality of the auditor’s reporting decision, i.e., auditors may be more likely to issue
nonstandard reports when such reports are warranted. Our analysis of the independent auditor’s
decision focuses on the likelihood of receiving a first-time qualified, or nonstandard, audit report.
Our definition of a nonstandard audit report includes those reports which were modified because
of material uncertainties, going concern uncertainties, departures from generally accepted
accounting principles and scope limitations. Those auditor reports which were modified because
of changes in accounting principles, with which the auditor concurred, are not included in the
definition.

While a number of state boards removed direct solicitation restrictions during the late 1970s
and 1980s, other accountancy boards retained rules that prohibited direct solicitation during this
period.² The lack of uniformity in state boards’ direct solicitation rules during the 1980s allows
cross-sectional examination of the extent to which direct solicitation rules affect auditor
decisions. Audit quality as revealed in auditor decisions has long been of interest to the profession
and increasingly so in recent years with the surge of litigation-related problems confronting many
audit firms. Concerns have been voiced that changes in competitive conditions in the audit market
are detrimental to audit quality. Solicitation represents one area in which a change has occurred
in recent years. Any evidence as to whether such a change in competitive conditions has led to
changes in auditor decisions has potential implications for other issues in the industrial
organization of the market for audit services. In particular, the change in solicitation policy
provides an opportunity to examine and provide insight into associations that exist among
information dissemination, auditor switching, client-auditor alignment, and auditor independence.

Logistic regression analysis is used to test the null hypothesis that the decision to issue a 
standard or nonstandard opinion does not differ between markets allowing and banning direct 
uninvited solicitation (henceforth the allowed and banned markets, respectively) in a sample of 
clients with a relatively high probability of receiving a nonstandard opinion. Choi and Jeter (1992) 
suggest that nonstandard audit reports are most likely in firms whose underlying operations have 
undergone economic or structural changes. Using models developed by Dopuch et al. (1987) and 
Bell and Tabor (1991), we identify clients with a relatively high probability of receiving 
nonstandard audit reports and explore differences in actual audit report distributions between 
allowed and banned markets. 

Results indicate that, among small clients, auditors are more likely to issue nonstandard 
reports in the allowed market in samples where the estimated probability of receiving a 
nonstandard report is relatively high, based on the Dopuch et al. (1987) and Bell and Tabor 


1 In 1976 the U.S. Senate Subcommittee on Reports, Accounting and Management (Metcalf Committee) expressed
concerns that solicitation restrictions were detrimental to the public interest because users of accounting services were
deprived of information needed to evaluate the types, amounts, and prices of services offered. In contrast, in 1978, the
AICPA's Commission on Auditors’ Responsibilities, the Cohen Commission, expressed concern regarding the
profession's ability to safeguard "professionalism and independence" in a highly competitive environment. The
Commission asserted that competitive pressures adversely affect the quality of work performed by individuals auditing
particular clients because accounting firms often cut costs to the point where the integrity of the independent audit is
impaired.

2 By 1994, every state which provided us information regarding solicitation policy had dropped its bans on direct
solicitation.

(1991) models. The results of this research reveal a lack of support for the contention that auditors
are more likely to allow their independence to be compromised or their audit effort to be reduced
due to increased competitive pressures in markets allowing direct uninvited solicitation.

The remainder of the paper is organized as follows. Section II describes the economic
development, section III describes the research method and design, and section IV presents
estimation and results. Section V is a conclusion.


II. ECONOMIC DEVELOPMENT


Researchers (Watts and Zimmerman 1986; DeAngelo 1981a, 1981b; Kanodia and Mukherji
1994; O'Keefe and Westort 1992; Deis and Giroux 1992) have examined two primary determinants
of the auditor's reporting decision: independence and competence. Watts and Zimmerman
(1986) define the attributes of auditor competence and independence as follows: the market's 
expectation of the auditor's competence is the probability that the auditor discovers a breach, 
conditional on a breach existing, which depends on how much effort the auditor devotes to the 
audit, the skill of the auditor, etc.; auditor independence is the probability that the auditor reports 
honestly if he/she observes a breach. Auditors have incentives to devise mechanisms to increase 
the market's expectations of their competence and independence, so long as the additional audit 
fees cover the costs of such mechanisms. However, auditors would not be expected to be either 
perfectly independent or completely competent, as the design of an audit program which would 
discover all breaches and ensure that all breaches be reported would likely be prohibitively costly. 
"Given competition in the auditing profession, the auditor cannot systematically depart from the 
optimal level of independence and survive." (Watts and Zimmerman 1981, 9) 

Recent research suggests a number of factors which may explain variations in auditor 
independence and/or competence. For example, Magee and Tseng (1990) argue that a reduction
in independence is most likely in cases where the accounting standards are not specific enough
that all auditors will agree on the preferred treatment.³ Beck et al. (1988) suggest that management
advisory service involvement may increase the economic bond between auditor and
client and lead to reduced independence. Eichenseher et al. (1989) and Palmrose (1988) find
evidence that the market perceives "brand name" auditors to be more independent and/or more 
competent. 

In the context of the current paper, however, the issue is whether the removal of direct 
solicitation bans causes a shift in the optimal levels of independence and/or competence and hence 
differences in the auditors’ reporting decisions. Thus, our interest is restricted to factors affecting 
independence and/or competence which are deemed likely to differ between banned and allowed 
markets. Such factors include auditor tenure and alignment; recent research suggests that the 
frequency of auditor switching differs in the allowed and banned markets, and these factors may 
be linked to audit quality (independence or competence) issues. Throughout this discussion we 
focus on the auditor's decision when conditions appear to warrant a nonstandard report. 


3 Magee and Tseng (1990) argue that a reduction in independence will only occur if the following conditions are present: 
(1) auditors must disagree among themselves on a client's reporting issue, (2) at the time of initial engagement, auditors 
do not know their positions on that issue, (3) the client does not know the incumbent auditor's position on the issue, 
(4) the issue must affect more than one reporting period, and (5) the client must benefit from the preferred strategy even 
after an auditor switch. 
Hereafter, any discussion of an incorrect report refers to the issuance of a standard report when 
a nonstandard report is warranted.⁴


Tenure and Alignment in Banned and Allowed Markets and Audit Quality 


A number of researchers have suggested that the quality of auditor reporting decisions 
deteriorates as auditor tenure increases (Mautz and Sharaf 1961; Deis and Giroux 1992; Beck et 
al. 1988). For example, Beck et al. (1988) use tenure as a proxy for a lack of independence. 
Similarly, Deis and Giroux (1992) contend that as an auditor's association with the client 
lengthens, the auditor may become complacent and less likely to use innovative procedures. This 
complacency could be argued to lead to a decline in competence, or the likelihood of discovering 
facts necessitating a qualification. Mautz and Sharaf (1961, 208) suggest that over a long 
association with a particular client, a “slow, gradual, almost casual erosion of his honest dis- 
interestedness” may impair an auditor’s independence. 

In a market where direct solicitation is allowed, average auditor tenure is likely to be shorter, 
ceteris paribus, as a result of higher expected turnover. Chaney et al. (1993) examined the 
association between direct solicitation and client-auditor realignments for clients within the Big 
8 market and found evidence of significantly more frequent auditor switches in the market 
allowing direct solicitation than in the market banning direct solicitation. 

Shorter tenure in the allowed market may result from one of two differences between the two 
markets: (1) differences in the flow of information, or (2) differences in transactions costs. When 
direct solicitation is banned, the exchange of information between nonincumbent auditors and 
prospective clients occurs only at the invitation of the clients, i.e., when clients are dissatisfied 
with their auditors and are thus motivated to search for new auditors. If a client is not overtly 
dissatisfied with an incumbent audit firm, she may not initiate a search for a new auditor; but if 
nonincumbent auditors approach the client, the client may agree to consider the proposal or 
proposals. When audit firms can self-select and offer their services to clients, prospective auditors 
may explain any unique or special services their firm offers, strengths their firm possesses relative 
to the incumbent (technological advantages, for example), any relevant concentration of particu- 
lar types of clients (industry-specific knowledge, for example), etc., in addition to providing 
client-specific input on the costs of performing the audit. Thus, the policy of allowing direct 
solicitation provides clients with more ready access to information about the alternatives 
available to them, auditor technologies, auditor specializations, etc., and increases the probability 
of turnover.⁵

In addition, the behavior of auditors in the banned and allowed markets may be influenced 
by the expectation that auditor tenure is longer in the banned market. DeAngelo (1981a and 
1981b) suggests that auditors' independence is threatened by the existence of a stream of client- 
specific quasi-rents, defined as the excess of the auditor's revenues over avoidable costs. 
According to DeAngelo, auditor independence is threatened when an incumbent auditor lowers 
audit quality to retain quasi-rents.⁶ If auditors in the banned market have an expectation of a longer 


4 We recognize the possibility that an auditor might issue a nonstandard report due to a lack of competence in a situation 
where a standard report is appropriate, but we consider this situation to be so rare that detection of a difference between 
the two markets is unlikely. This issue is left for future research. 

5 Interestingly, arguments could be made to predict just the opposite in the long run. If the policy of allowing solicitation 
leads to better client-auditor alignment, we would expect to see more frequent switching in the early years of solicitation 
(those in our test period) but might see less frequent switching in later years. This issue is also left for future research. 

6 Kanodia and Mukherji (1994) question DeAngelo's assumptions and suggest that, when the incumbent auditor has 
private information regarding client-specific audit costs, the client's threat to terminate an incumbent auditor is not 
credible. 
tenure, on average, then a longer and hence larger stream of quasi-rents in that market, following 
the DeAngelo premise, would present a greater threat to auditor independence. 

A second argument relates to potential differences in transactions costs between the banned 
and allowed markets. In DeAngelo's paper (1981a), quasi-rents are positively associated with 
both nonincumbent start-up costs and client switching costs because the incumbent revenues are, 
in part, based upon nonincumbent start-up costs and client switching costs. Since auditors in the 
allowed market can self-select and approach those clients to which they are best suited or 
matched, the pool of auditors the client considers in the allowed market is more likely to include 
auditors with potentially lower learning costs. If a better alignment of client and auditor occurs 
in the allowed market due to unrestricted information flow, then these transactions costs may be 
lowered. For example, if the nonincumbent auditor in the allowed market specializes in a 
particular industry or has special expertise or technologies suitable for the prospective client and 
communicates these skills or specialties, then the learning costs will be reduced relative to those 
in the banned market. 

To the extent that the transactions costs of switching auditors are lowered in the allowed 
market, the incumbent's fee and hence the client-specific quasi-rents are reduced in the allowed 
market. Thus an auditor in this market stands to lose less from a particular client and would be 
less likely to risk independence impairment. Ceteris paribus, these arguments suggest that 
auditors are more likely to issue a nonstandard report in the allowed market than in the banned 
market when circumstances indicate that such a report is appropriate. This prediction directly 
contrasts with arguments that pressure to acquire or maintain audit clients in a highly competitive 
environment (such as the allowed market) leads to impairment of audit quality. While the allowed 
market may indeed be more competitive, economic incentives must exist for auditors to lower 
independence or competence. The above arguments provide no indication that such incentives are 
increased in the allowed market and suggest, to the contrary, that independence and/or compe- 
tence levels may be greater in the allowed market. The following analysis examines whether 
auditors issue nonstandard reports more sparingly in the banned market than in the allowed, while 
controlling for other variables deemed likely to influence the auditors' decisions. 


Size Effects 


In an effort to assess the impact, if any, that solicitation policy has on the audit decision 
rendered, it must be acknowledged that there are numerous factors that influence competitive 
forces, and thus an auditor's tendency to sacrifice independence or reduce effort in the audit 
market. One factor which has been argued to have such influence is client size. Arguments have 
been raised that financial statement users perceive auditors of large clients as less independent 
(Knapp 1985; Pany and Reckers 1980; Deis and Giroux 1992), perhaps because larger clients 
represent a larger portion of the auditors’ revenues. Also, Chow and Rice (1982) present evidence 
that small clients are more likely to receive nonstandard audit reports than large clients. 

On the other hand, arguments could be advanced that auditors of small clients are more likely 
to suffer from a lack of independence. Simunic (1984) presents evidence that companies 
purchasing both audits and management advisory services (MAS) are, on average, somewhat 
smaller in size than companies not purchasing MAS. Beck et al. (1988) argue that management 
advisory service involvement poses a threat to auditor independence by increasing the economic 
bond between auditor and auditee. Briloff (1994) draws attention once more to the conflicts of 
interest and the potential contamination of the auditor’s independence in cases where auditors 
also are consultants. Recent data from a United Nations Survey point to significant increases in 
management consulting revenues from 1978 through 1991, as well as to large percentages of total 


298 The Accounting Review, April 1995 


revenues from management consulting services for some firms.⁷ The SEC’s position regarding 
required disclosure of percentage of fees from MAS is consistent with this concern. If auditors 
of small clients are more likely to serve as consultants to those clients, quasi-rents from consulting 
may pose a threat to the auditors’ independence, particularly for smaller clients. Also, it might be 
argued that the ownership structure of smaller clients is generally more closely held and that their 
boards of directors and audit committees tend to be less independent. 

In view of these conflicting arguments, we do not predict a directional effect of size on the 
type of audit report issued, but client size is an important factor to control for and to consider in 
our interpretation of the effects of solicitation on the likelihood of audit firms issuing nonstandard 
audit reports. Furthermore, we argue that allowing solicitation provides auditors with additional 
opportunities to communicate information to prospective clients. Since larger clients tend to 
operate in a more sophisticated information environment on average than smaller clients, they are 
more likely to be already aware of the alternatives, specializations, etc. communicated by 
allowing solicitation. Hence the gain in information may be greater for smaller, less sophisticated 
clients. Thus, in addition to considering the effect of size on the type of report issued, we also 
consider potential interaction between client size and solicitation policy. 


III. RESEARCH METHOD AND DESIGN 


Empirical detection of differences in auditor reporting decisions between the allowed and 
banned markets depends upon our ability to control for extraneous factors, such as the size effect 
discussed in the preceding section, and to identify a sample of clients where nonstandard reports 
are likely to be appropriate. For both objectives, we use models developed by Dopuch et al. (1987) 
and Bell and Tabor (1991) to predict first-time uncertainty qualifications. The Bell and Tabor 
(hereafter B-T) model includes financial statement variables, while the Dopuch et al. (hereafter 
DHL) model includes financial and market variables associated with the issuance of nonstandard 
audit reports. B-T suggest that their model is useful as an aid to auditors early in the audit for 
forming an expectation of engagement risk, as an aid in making the final audit decision, and as 
a tool in lawsuits on behalf of auditors who fail to issue nonstandard reports and are subsequently 
sued. Similarly, DHL claim that their model can be useful as an audit tool to screen potential 
clients and to identify clients that are likely to receive nonstandard reports, and as a benchmark 
to represent the probability that a typical auditor would issue a nonstandard report in (i) peer 
review committees, (ii) debates about opinion shopping, (iii) quality control procedures, and 
(iv) court cases involving auditor reporting negligence. 


Probability Estimates 


In addition to using the B-T and DHL models to identify factors which affect the likelihood 
of receiving a nonstandard audit report, we also use these models to identify a sample in which 
a nonstandard report is relatively likely to be appropriate for each 
client. In a population of clients with low probabilities of receiving a nonstandard audit report, 
it is likely to be very difficult to pick up any difference between the two markets. When the 
benchmark report is a standard report, we expect most firms to issue a standard report regardless 
of their solicitation policy. A distinction between the two markets, if such a distinction exists, is 


7 As one example, data on Arthur Andersen reveal a 635 percent increase in MAS revenues from 1978 to 1987 and a 170 
percent increase from 1987 to 1991; in 1992 MAS revenues represented 46 percent of gross U.S. revenues for Arthur 
Andersen. 
more likely to be detected in a population of clients with relatively high probabilities of receiving 
nonstandard audit reports. Both the B-T and DHL models are designed to distinguish between 
clients likely to receive nonstandard reports and those likely to receive standard audit reports. 

To compute estimates of the probabilities of receiving first-time nonstandard reports, we 
apply the B-T and DHL models to two separate sets of clients conforming to the B-T and DHL 
sampling criteria respectively. We estimate the probability for each client of receiving a 
nonstandard audit report using variable coefficient estimates as reported in their papers.⁸ If the 
estimated probability of receiving an audit report modified because of uncertainties is less than 
0.5 for a particular client, we eliminate that client from our B-T and DHL samples. In addition 
to increasing the power of our tests, excluding the lower probability levels from our samples 
accomplishes two other purposes as well. First, by focusing on samples of firms where the 
probability of receiving a nonstandard report is high, we are focusing on situations where the 
uncertainty about reporting issues (e.g., as alluded to by Magee and Tseng) is likely to be high 
as well.⁹
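
The screening step described above can be sketched in a few lines. This is only an illustration, assuming a logistic link between the linear index and the probability; the coefficient values, client identifiers, and function names below are hypothetical and are not the published B-T or DHL estimates. If a source model were estimated as a probit, the standard normal CDF would take the place of the logistic function.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Return 1 / (1 + exp(-z)) where z = intercept + sum(coef * value)."""
    z = intercept + sum(c * v for c, v in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-z))


def high_probability_sample(clients, intercept, coefficients, cutoff=0.5):
    """Keep clients whose estimated probability is not below the cutoff."""
    kept = []
    for client_id, values in clients:
        p = logistic_probability(intercept, coefficients, values)
        if p >= cutoff:  # clients with p < cutoff are eliminated
            kept.append((client_id, p))
    return kept


# Hypothetical coefficients and client data, for illustration only.
example_intercept = -1.0
example_coefficients = [2.0, -0.5]
example_clients = [
    ("A", [1.0, 0.0]),  # z = 1.0
    ("B", [0.0, 2.0]),  # z = -2.0
    ("C", [0.5, 0.0]),  # z = 0.0, probability exactly 0.5
]
selected = high_probability_sample(
    example_clients, example_intercept, example_coefficients
)
```

Because only probabilities below 0.5 are eliminated, a client sitting exactly at the cutoff is retained in this sketch.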

Second, we considered the possibility that economic, regulatory, or other differences among 
states might affect the proportion of first-time nonstandard reports across states. When the 
sample is limited to clients identified by the B-T and DHL models as having a relatively high 
probability of receiving a first-time nonstandard audit report, controls for many of these factors 
are present. While there are undoubtedly some clients who receive first-time nonstandard audit 
reports for reasons not captured by these models, those clients are unlikely to be a part of the 
sample at high probability levels. For example, if certain industries concentrated in particular 
states were prone to receive a nonstandard report during our test period for regulatory reasons not 
captured by financial or market variables, those clients will be excluded from the high probability 
levels. This, in part, serves to eliminate potential bias created by this type of concentration. The 
higher probability levels are thus limited to clients likely to receive a first-time nonstandard audit 
report for reasons captured by financial and market variables, resulting in a more homogeneous 
sample of clients. 

By limiting our sample to clients identified as having a relatively high probability of receiving 
a first-time nonstandard audit report, we acknowledge certain limitations. Our sample of clients 
is not representative of the entire population of audit clients. Further, a possibility exists that 
auditors might be more likely to jeopardize their independence in cases where the probability is 
relatively low. For example, large, financially healthy clients might pose a greater threat to 
independence and yet be excluded from the high probability levels. 


8 As an alternative approach, we reestimated the coefficients for purposes of sample selection and then ranked the 
probabilities of a nonstandard audit report. The predictive ability of the B-T and DHL models using the original 
coefficient estimates was approximately the same as using the reestimated coefficients. We performed our analyses on 
samples of the highest probability firms with various sample sizes, with the resulting coefficients very similar to those 
reported in tables 6 and 7. 

9 We do not claim to determine whether the reports rendered in the respective markets are optimal for the cost/benefit 
tradeoffs facing the auditors; i.e., the tradeoff between the risk of losing clients by issuing nonstandard reports too often 
and the threat of lawsuit (and loss of reputation) by not issuing nonstandard reports often enough. 

10 As a precursor to this analysis, we prepared a graph for each state from 1980 through 1987 of the percentage of 
nonstandard audit reports. We included years and states in which some or all of the direct solicitation information needed 
to be included in our sample was missing, with appropriate notation. We examined these graphs in search of any unusual 
patterns, such as particularly high percentages of nonstandard reports in particular years or particular states. No such 
patterns were noted. We also considered the possibility that states and years where information was unavailable (and 
hence excluded from our tests) exhibited any systematic differences from those state-year observations included. No 
such systematic differences were noted. 
Solicitation Policy Data Collection 


Determining the sample of clients required both historical information regarding CPA 
boards’ solicitation policies and information on each client’s primary place of operations. 
Historical information was obtained by mailing questionnaires to all 50 state CPA boards. The 
boards were asked to provide copies of the Code of Ethics provisions regarding direct solicitation 
in effect in 1975 as well as copies of amendments to these ethics provisions in periods subsequent 
to 1975. The years 1975 through 1979 were eliminated from the study as both advertising and 
direct solicitation were banned in these years. Table 1 summarizes and classifies the solicitation 
rules as banned (B) and allowed (A) by year (1980 through 1987) and by state. In banned states, 
all forms of direct solicitation are prohibited. Three additional codes are used: C denotes the year 
of change from banned to allowed; O is used where written direct solicitations are permitted and 
oral direct solicitations are prohibited; and X indicates that the direct solicitation rule information 
was not available. Client firms were excluded from the sample for any year in which they received 
one of these three codes. Two samples of clients conforming to the B-T and DHL sampling criteria 
were selected for clients with primary operations in states for years in which the states either 
banned or allowed direct solicitation. Observations were assigned to banned and allowed markets 
using the National Bureau of Standards’ Federal Information Processing Standards (FIPS) state 
code provided by Compustat, a code which designates the principal location of a company’s 
operation. One limitation of our selection process is that we cannot verify how many of the states 
banning solicitation actually enforced such bans. Our B-T and DHL samples are described in the 
following two sections. 
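
The assignment rule described above (keep a client-year only when its state code is A or B, and drop C, O, and X) might be sketched as follows. The rows are copied from Table 1 for a handful of states; keying on state name rather than on the FIPS state code is a simplification for illustration, and the names `POLICY` and `market` are hypothetical.

```python
FIRST_YEAR = 1980

# One character per year, 1980 through 1987, copied from Table 1.
POLICY = {
    "Idaho": "BBBBBBBB",
    "Kansas": "BBBBBBCA",
    "Florida": "OOOOOOOO",
    "Illinois": "AAAAAAAA",
    "Alabama": "XXXXAAAA",
}


def market(state, year):
    """Return 'allowed', 'banned', or None if the client-year is excluded."""
    code = POLICY[state][year - FIRST_YEAR]
    if code == "A":
        return "allowed"
    if code == "B":
        return "banned"
    # C (year of change), O (oral solicitation banned only), and
    # X (information unavailable) are all excluded from the sample.
    return None
```

For example, a Kansas client would fall in the banned market through 1985, drop out in 1986 (the year of change), and enter the allowed market in 1987.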


B-T Sample 


We identified clients with audit report information available on the Compustat Annual 
Industrial Tape for years 1980 through 1987. Clients receiving nonstandard reports or disclaimers 
in the prior year or disclaimers in the current year were eliminated to be consistent with the design 
and implementation of the B-T model. Thus, our preliminary sample includes clients receiving 
either a standard report or a “first-time” nonstandard report. We then deleted clients with missing 
SIC code information, regulated clients, financial institutions, and service clients, again to be 
consistent with the sample selection process used by B-T; we also deleted clients with missing 
direct solicitation information, clients in years with a change in solicitation policy, clients in states 
banning only oral solicitation, clients with missing variable information needed to carry out our 
tests, and clients with an estimated probability lower than 0.50 of receiving a nonstandard audit 
report based on the B-T model. Clients are defined as regulated if their four-digit SIC codes are 
between 4,000 and 4,999, inclusive; clients are defined as financial institutions if their four-digit 
SIC codes are between 6,000 and 6,999, inclusive; and clients are defined as service clients if their 
four-digit SIC codes fall between 7,000 and 8,999, inclusive. These deletions were made in the 
order listed. The deletions for the set of clients receiving first-time nonstandard reports are 
summarized in table 2, along with the resulting sample size for each year. In addition to the 183 
first-time nonstandard reports identified by these procedures, table 2 reports a total of 787 clients 
receiving standard audit reports in the current year and prior year, after performing the same 
deletions. 
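
The SIC-based exclusions stated above translate directly into range checks. The sketch below is illustrative only: the function names are hypothetical, and it covers just the industry screen, not the other deletions (solicitation data, missing variables, or the 0.50 probability cutoff).

```python
def sic_category(sic):
    """Classify a four-digit SIC code using the ranges given in the text."""
    if 4000 <= sic <= 4999:
        return "regulated"
    if 6000 <= sic <= 6999:
        return "financial"
    if 7000 <= sic <= 8999:
        return "service"
    return "other"


def passes_bt_industry_screen(sic):
    """Return True if the client survives the B-T industry deletions.

    A missing SIC code is represented here as None (an assumption about
    how missing data would be encoded).
    """
    if sic is None:
        return False
    return sic_category(sic) == "other"
```

Under the DHL sampling criteria described later in the paper, only the financial-institution range would be excluded, so the same classifier could be reused with a different screen.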


DHL Sample 


We included only New York and American Stock Exchange (NYSE and ASE) clients in this 
sample, in a manner consistent with that used by DHL. Our initial data source was Compustat 
tapes, and we identified clients with audit report information available on Compustat. As in the 
TABLE 1 
Summary of State Direct Solicitation Rules: 1980-1987 

State           1980  1981  1982  1983  1984  1985  1986  1987
Alabama          X     X     X     X     A     A     A     A
Alaska           B     C     A     A     A     A     A     A
Arizona          A     A     A     A     A     A     A     A
Arkansas         B     C     A     A     A     A     A     A
California       A     A     A     A     A     A     A     A
Colorado         X     O     C     A     A     A     A     A
Connecticut      X     X     X     X     X     X     X     X
Delaware         A     A     A     A     A     A     A     A
Florida          O     O     O     O     O     O     O     O
Georgia          X     X     X     X     X     O     O     O
Hawaii           X     X     X     X     A     A     A     A
Idaho            B     B     B     B     B     B     B     B
Illinois         A     A     A     A     A     A     A     A
Indiana          X     X     X     X     X     A     A     A
Iowa             X     X     X     X     X     X     X     X
Kansas           B     B     B     B     B     B     C     A
Kentucky         B     B     B     C     A     A     A     A
Louisiana        X     X     X     X     X     X     O     O
Maine            X     X     X     X     A     A     A     A
Maryland         X     X     X     X     X     X     X     A
Massachusetts    B     B     B     B     B     B     B     B
Michigan         B     B     B     B     B     B     B     C
Minnesota        B     B     B     B     B     B     B     B
Mississippi      B     B     B     B     B     B     B     B
Missouri         X     X     X     X     X     X     X     A
Montana          B     B     B     B     B     B     B     B
Nebraska         B     B     B     C     A     A     A     A
Nevada           B     B     B     B     B     B     C     A
New Hampshire    X     X     X     X     A     A     A     A
New Jersey       A     A     A     A     A     A     A     A
New Mexico       B     B     B     B     B     X     X     X
New York         X     X     X     X     A     A     A     A
North Carolina   A     A     A     A     A     A     A     A
North Dakota     X     X     A     A     A     A     A     A
Ohio             B     B     B     B     B     B     B     B
Oklahoma         X     X     X     X     X     X     X     A
Oregon           X     X     X     X     X     X     X     A
Pennsylvania     X     X     X     X     X     X     X     X
Rhode Island     X     X     X     X     A     A     A     A
South Carolina   X     X     X     X     X     X     A     A
South Dakota     X     X     X     X     X     A     A     A
Tennessee        B     B     B     B     B     C     A     A
Texas            B     B     B     B     B     B     B     B
Utah             X     X     X     X     X     X     X     X
Vermont          B     B     B     B     B     B     B     B
Virginia         X     X     X     X     X     X     X     A
Washington       B     B     B     C     A     A     A     A
West Virginia    X     X     X     X     X     X     X     X
Wisconsin        X     X     X     X     X     X     X     A
Wyoming          B     B     B     B     B     B     B     B

A = direct solicitation was allowed. 
B = direct solicitation was prohibited. 
C = the prohibition on direct solicitation was removed or partially removed. 
O = oral direct solicitation was prohibited. 
X = information was unavailable. 
B-T sample, we eliminated clients receiving nonstandard reports or disclaimers in the prior year. 
Our preliminary DHL sample includes clients receiving either a standard report, first-time 
nonstandard audit report, or first-time disclaimer for years 1980 through 1987.¹¹ After deleting 
clients with missing SIC information, we eliminated financial institutions. Note, however, that 
regulated clients and service clients are not deleted by DHL (nor by us in our DHL sample) as they 
were by B-T. We then deleted non-NYSE/ASE clients. To be consistent with the sample selection 
process used by DHL, we deleted clients changing their fiscal year end. Additional deletions were 
made for clients in states where we have no policy information, clients in the year of a change in 
solicitation policy, clients in states banning only oral solicitation, clients with missing variable 
information, and clients with an estimated probability lower than 0.50 of receiving a nonstandard 
audit report based on the DHL model. The deletions were made in the order listed and are 
summarized in table 3. In addition to the 55 first-time nonstandard reports and disclaimers 
identified by these procedures, we included a total of 133 standard audit reports identified by the 
same procedures. 


IV. DATA ANALYSIS 
B-T Sample Analysis 


We test the association between the direct solicitation policy and the auditor’s reporting 
decision for our B-T sample using the following model: 


\[
Y_i = a_0 + \sum_{j=1}^{10} a_j X_{ij} + \varepsilon_i .
\]


The dependent variable (Y_i) is the type of audit report issued in the current year, where 1 is a first- 
time nonstandard audit report and 0 represents a standard audit report. The independent variables 
(X_{ij}) for the logistic model are as follows: 





Variable                                     Definition
X_{i1}   Direct Solicitation                 0/1 dummy variable set to 1 if direct solicitation is allowed
                                             in client i's principal place of business.
X_{i2}   Log of Total Assets (Size)          Natural logarithm of the book value of total assets at the end
                                             of the client's fiscal year.
X_{i3}   Direct Solicitation * Log of        A measure of the interaction of the solicitation policy and
         Total Assets                        client size.
X_{i4}   Big 8 Auditor                       0/1 dummy variable set to 1 if the client's auditor in the
                                             current year is a Big 8 auditor.
X_{i5}   RC[NI/(PS + Common Equity)]         Rate of change (RC) in the ratio of net income before
                                             extraordinary items and discontinued operations to the sum
                                             of preferred stock and common equity.
X_{i6}   RC(Inventory/Net Sales)             Rate of change (RC) in the ratio of inventory to net sales.
X_{i7}   RC(Receivables/Inventory)           Rate of change (RC) in the ratio of receivables to inventory.
X_{i8}   RC(CA/CL)                           Rate of change (RC) in the ratio of current assets (CA) to
                                             current liabilities (CL).

(Continued) 


11 To verify the Compustat classifications of auditor opinions, we manually examined a sample of the opinions. We found 
the Compustat classifications to be highly reliable. 
Variable                                     Definition
(Continued)
X_{i9}   ST[NI/(PS + Common Equity)]         Industry-standardized change in the ratio of net income before
                                             extraordinary items and discontinued operations to the sum
                                             of preferred stock and common equity.
X_{i10}  ST[Total Debt/(PS + Common          Industry-standardized change in the ratio of the sum of debt
         Equity)]                            in current liabilities plus long-term debt to the sum of
                                             preferred stock and common equity.


The coefficient of primary interest is that for the direct solicitation dummy variable, which 
should reflect the difference between the two markets, if any, in the likelihood of audit firms 
issuing nonstandard reports. Based on the arguments presented in the economic development 
section, we expect the sign of the coefficient for this variable to be positive. 

Conflicting arguments were presented in the economic development section for the variable 
for client size, measured as the log of total assets. Thus we make no prediction for this variable. 
We include a variable for the interaction between client size and solicitation policy because of 
differences in the information environment of large versus small clients. A potential interaction 
between client size and solicitation policy must be considered in interpreting any differences 
revealed between the banned and allowed markets. As mentioned earlier, if allowing direct 
solicitation provides auditors with additional opportunities to communicate information to the 
client about the alternatives available to them, auditor technical capabilities, industry specializa- 
tion, etc., large clients are more likely to be already aware of these alternatives and the gain in 
information may be greater for smaller, less sophisticated clients. 
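
To make the role of the interaction term concrete, the model can be read as a standard logit in which the linear index is the log-odds of a first-time nonstandard report. This reading, and the numbering of the variables in the order they are listed (solicitation dummy first, log of total assets second, their product third), are assumptions made here for illustration:

```latex
\[
\ln\frac{\Pr(Y_i = 1)}{1 - \Pr(Y_i = 1)} = a_0 + \sum_{j=1}^{10} a_j X_{ij},
\qquad X_{i3} = X_{i1}\,X_{i2}.
\]
Holding the other variables fixed, the difference in log-odds between the
allowed market ($X_{i1} = 1$) and the banned market ($X_{i1} = 0$) for a
client of size $X_{i2}$ is
\[
\bigl(a_1 + a_2 X_{i2} + a_3 X_{i2}\bigr) - a_2 X_{i2} = a_1 + a_3 X_{i2}.
\]
```

Under this reading, the solicitation effect is not a single number but varies with client size, which is why any difference between the banned and allowed markets has to be interpreted jointly with the interaction coefficient.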

The dummy variable for Big 8/non-Big 8 is included to control for the possibility that audit 
reporting decisions differ between Big 8 and non-Big 8 firms. As suggested by Johnson and Lys 
(1990), the literature on differences among audit firms may be categorized into two basic 
explanations: differences in audit technology, specializations, etc. (Eichenseher 1984; Danos and 
Eichenseher 1982) and differences in reputation and audit quality (Chow and Rice 1982; 
DeAngelo 1981b). Johnson and Lys (1990) use relative audit firm size as a proxy for cost structure 
variations, arguing that audit firms with similar production technologies will attract similar 
clients, thus affecting client/auditor alignment. Francis and Simon (1987), Palmrose (1988), and 
Palmrose (1986) argue that reputation differs between Big 8 and non-Big 8 audit firms. Gaver et 
al. (1992) find that the probability of school district audits in California being classified as 
substandard decreases significantly with auditor size. 

The remaining variables were identified by B-T as affecting the likelihood of receiving a 
nonstandard audit report.¹² The return on investment variables (RC[NI/(PS + Common Equity)] 
and ST[NI/(PS + Common Equity)]) are intended to capture the probability of recurring client 
losses. A short-term liquidity measure [RC(CA/CL)] is included to indicate possible deficiencies 
in working capital. Variables for inventory intensiveness [RC(Inventory/Net Sales)] and receiv- 
ables intensiveness [RC(Receivables/Inventory)] are included as measures of the degree of 
uncertainty about the recoverability of these assets; their inclusion is based on evidence that 
receivables and inventory are high-risk accounts from a standpoint of auditor exposure to loss. 


12 As defined in B-T, the rate of change in financial statement variables (RC) represents the difference between the ratio 
at the end of the fiscal year and the ratio at the end of the previous fiscal year divided by the ratio at the end of the previous 
fiscal year. The industry-standardized financial statement variables (ST) represent the difference between the ratio at 
the end of the current fiscal year and the ratio at the end of the previous fiscal year divided by the standard deviation 
of the ratio across the industry for the previous year. 
The financial leverage variable (ST[Total Debt/(PS+Common Equity)]) is included to indicate 
potential violations of debt covenants. 
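
The RC and ST transformations used for the B-T financial variables follow directly from their verbal definitions. The sketch below is a literal transcription of those definitions; the function names and the numeric inputs are hypothetical, and no handling is attempted for a zero prior-year ratio or a zero industry standard deviation, for which the measures are undefined.

```python
def rate_of_change(current_ratio, prior_ratio):
    """RC: change in the ratio divided by the prior-year ratio."""
    return (current_ratio - prior_ratio) / prior_ratio


def industry_standardized_change(current_ratio, prior_ratio, industry_std_prior):
    """ST: change in the ratio divided by the prior-year standard
    deviation of the ratio across the industry."""
    return (current_ratio - prior_ratio) / industry_std_prior


# Hypothetical inputs: a current ratio (CA/CL) falling from 2.0 to 1.5,
# with a prior-year industry standard deviation of 0.25.
rc_example = rate_of_change(1.5, 2.0)
st_example = industry_standardized_change(1.5, 2.0, 0.25)
```

In this illustration the ratio falls by a quarter of its prior level (RC of -0.25), which corresponds to a drop of two industry standard deviations (ST of -2.0).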


DHL Sample Analysis 


Next we test the association between the direct solicitation policy and the auditor's reporting 
decision for our DHL sample using the following model: 


\[
Y_i = a_0 + \sum_{j=1}^{11} a_j X_{ij} + \varepsilon_i .
\]


The dependent variable is defined as in the B-T model. The independent variables (X_{ij}) for the 
logistic model are as follows: 


Variable                                     Definition
X_{i1}   Direct Solicitation                 0/1 dummy variable set to 1 if direct solicitation is allowed
                                             in client i's principal place of business.
X_{i2}   Log of Total Assets (Size)          Natural logarithm of the book value of total assets at the end
                                             of the client's fiscal year.
X_{i3}   Direct Solicitation * Log of        A measure of the interaction of the solicitation policy and
         Total Assets                        client size.
X_{i4}   Big 8 Auditor                       0/1 dummy variable set to 1 if the client's auditor in the
                                             current year is a Big 8 auditor.
X_{i5}   Time Listed (0/1)                   0/1 dummy variable set equal to one if the firm has been listed
                                             on the New York or American Stock Exchange for more than
                                             five years.
X_{i6}   Returns - Industry (percent)        Common stock returns (including dividends) minus an equally
                                             weighted industry index.
X_{i7}   ΔBeta                               Change in beta, the slope coefficient of the market model
                                             regression.
X_{i8}   ΔResidual Standard                  Change in the residual standard deviation from the market
         Deviation Returns                   model regression.
X_{i9}   Δ(Receivables/Total Assets)         Change in the ratio of receivables to total assets (including
                                             capitalized leases).
X_{i10}  Δ(Inventory/Total Assets)           Change in the ratio of inventory to total assets (including
                                             capitalized leases).
X_{i11}  Δ(Total Liabilities/Total           Change in the ratio of total liabilities (including capitalized
         Assets)                             leases) to total assets (including capitalized leases).


Again, our primary interest is the estimated coefficient for the direct solicitation dummy 
variable, which should reflect the difference between the two markets, if any, in the likelihood 
of audit firms issuing nonstandard reports. We expect the sign of the coefficient for this variable 
to be positive. 

Similar control variables are included as in the B-T sample, and the remaining variables were 
identified by DHL as affecting the likelihood of receiving a nonstandard audit report. The DHL 
model includes four financial statement variables³ (change in leverage, change in the ratio of


3 DHL also included a fifth financial statement variable defined as a 0/1 dummy variable set equal to 1 if the firm
experienced a net loss in the current year. In our sample, after having eliminated all firms except those with a relatively 
high (greater than .5) probability of receiving a nonstandard report in the current year, we found that all of our firms had 
a net loss in the current year. 
receivables to total assets, change in the ratio of inventories to total assets, and client size)⁴ and
four stock market variables. DHL suggest that firms are more likely to receive a nonstandard 
audit report when their financial condition deteriorates not only because of going concern issues, 
but also because auditors are more likely in such cases to consider contingencies of a certain 
magnitude to be material. 


Summary Descriptive Statistics 


Summary descriptive statistics for the independent variables used in our tests are presented 
in tables 4 and 5 for the B-T and DHL samples. Table 4 compares variables for clients receiving
nonstandard (NSR) and standard reports (SR), while table 5 compares variables for clients in the 
allowed and banned markets. Based solely on these univariate tests, there is not a significant 
association between the direct solicitation policy and the auditors' reporting decisions. For 
example, in the B-T sample as seen in table 4, 65.7% and 63.5% of the clients receiving 
nonstandard and standard reports, respectively, are in the allowed market. The Z-statistic reveals 
no significant difference in these proportions. However, univariate tests reveal an association 
between independent variables with both the solicitation policy (table 5) and the auditor's 
reporting decision (table 4), indicating the need for multivariate tests. 
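To illustrate the proportion comparison described above, the following sketch computes a pooled two-proportion Z-statistic. Two assumptions are made that the text does not confirm: that the pooled-variance form of the test is the one used, and that the group sizes behind the table 4 percentages equal the B-T sample sizes reported with table 6 (NSR = 183, SR = 787). The function name is hypothetical.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion Z-statistic for H0: the two proportions are equal."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / std_err


# Share of clients located in the allowed market, B-T sample:
# 65.7% of nonstandard-report clients vs. 63.5% of standard-report clients.
# Group sizes are assumed to match those reported with table 6.
z = two_proportion_z(0.657, 183, 0.635, 787)
significant_at_5_percent = abs(z) > 1.96
```

Under these assumptions the statistic falls well inside the conventional two-tailed critical value of 1.96, in line with the statement that the proportions do not differ significantly.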

For the B-T sample, significant differences are reported between the NSR and SR sample 
means (table 4) for client size (log of total assets), RC[NI/(PS + Common Equity)], RC(Inventory/ 
Net Sales), RC(CA/CL), ST[NI/(PS + Common Equity)], and ST[Total Debt/(PS + Common 
Equity)]. For the DHL sample, significant differences are reported between the NSR and SR 
sample means (table 4) for industry returns, A(residual standard deviation returns), A(inventory/ 
total assets), and A(total liabilities/total assets). Table 5 reveals significant differences between 
the allowed and banned sample means (proportions) for client size and the Big 8 variable for both 
the B-T and DHL samples. 

We also computed descriptive information regarding average auditor tenure in the banned 
and allowed markets. When Chaney et al. (1993) examined the association between direct 
solicitation and client-auditor realignments for clients within the Big 8 market, they found 
evidence of significantly more frequent auditor switches in the market allowing direct solicitation 
than in the market banning direct solicitation, after controlling for differences between clients in 
the two markets which might affect the number of client-initiated switches in the respective 
markets. An examination of the auditor tenure characteristics of the samples used in the current 
paper yields similar results. The descriptive information on tenure is not included in tables 4 and 
5, as the samples of firms used for the tenure comparison differ from the samples used in our other 
tests. Our data source, Compustat, codes Big 8 auditors individually but codes all non-Big 8
auditors the same. Thus, if a firm switched from one non-Big 8 auditor to a different non-Big 8 
auditor, we would be unable to detect the change. For samples of clients audited by Big 8 auditors, 
we found that average tenure was significantly longer in the banned market than in the allowed 
market. This finding is consistent with the arguments presented in the economic development 
section, as well as with the earlier research referenced above. 


4 As defined in DHL, the change in financial statement variables represents the difference between the book value of the
variable at the end of the current fiscal year and its book value at the end of the previous fiscal year.

5 As defined in DHL, returns are estimated over 260 trading days prior to the fiscal year end. Observations are eliminated 
if there are fewer than 100 trading days available in that period. The change in market variables (A) represents the 
difference between the variable measured over the 260 trading days prior to the fiscal year end and the 260 trading days 
prior to the previous fiscal year end. Observations are eliminated if there are fewer than 100 trading days available in 
each fiscal year. Market model regressions are based on the equally weighted NYSE and AMEX Index. 
Estimation and Results


The results of the logit analysis are presented in tables 6 and 7 for the B-T and DHL samples 
respectively. As predicted, the coefficient for the direct solicitation dummy variable is positive 
and significant at the 0.01 level for both the B-T and DHL samples. This variable reflects the 
difference between the two markets in the likelihood of audit firms issuing nonstandard reports, 
suggesting that firms in the allowed market are more likely to receive a nonstandard audit report 
than firms in the banned market in a sample of clients where nonstandard reports are likely to be 
appropriate. These results are consistent with the arguments that the greater flow of information 
and shorter tenure in the allowed market lead to increased levels of independence and/or 
competence. 

The sign of the size coefficient is mixed between samples, positive and significant in the DHL 
sample, negative but insignificant in the B-T sample. Based on previous evidence presented by 
Chow and Rice (1982), we might have expected this coefficient to be negative, suggesting that 
smaller clients are more likely to receive nonstandard reports than large clients. However, in this 
paper we focus on categories where the probability of receiving such a report is relatively high. 
Thus, it appears that findings are mixed when nonstandard reports are likely to be warranted. A 
possible explanation for a positive coefficient for the size variable is that auditors, always aware 
of the potential losses resulting from audit report errors, are aware that the likelihood of discovery 
(and the concomitant cost) is greater on average for large, highly visible clients. 

The sign of the coefficient of the size-solicitation interaction variable is negative and 
significant for both samples. This suggests that as the size of the client increases, the association 
between auditors' reporting decisions and direct solicitation policy weakens. Thus, for clients of 
a certain size, there is no difference between the banned and allowed markets. For still larger 
clients, however, the sign of the direct solicitation effect actually reverses, suggesting more nonstandard
reports in the banned market. 
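The size at which the solicitation effect vanishes follows directly from the two coefficients: the effect of moving from the banned to the allowed market on the logit index is a_1 + a_3 × log(total assets), which equals zero at log(total assets) = -a_1/a_3. The sketch below applies this to the coefficients reported in tables 6 and 7. The function names are hypothetical, and because the units of total assets are not stated here, the break-even point is left in log units rather than converted to dollars.

```python
def solicitation_effect(a_solicit: float, a_interact: float,
                        log_assets: float) -> float:
    """Shift in the logit index from banned (0) to allowed (1) at a given size."""
    return a_solicit + a_interact * log_assets


def break_even_log_assets(a_solicit: float, a_interact: float) -> float:
    """Log of total assets at which the solicitation effect equals zero."""
    if a_interact == 0:
        raise ValueError("no break-even point without an interaction effect")
    return -a_solicit / a_interact


# (direct solicitation coefficient, interaction coefficient) from tables 6 and 7.
BT_COEFS = (0.755, -0.270)
DHL_COEFS = (3.891, -0.725)

bt_break_even = break_even_log_assets(*BT_COEFS)    # about 2.80 log units
dhl_break_even = break_even_log_assets(*DHL_COEFS)  # about 5.37 log units
```

Below the break-even size the estimated effect is positive (more nonstandard reports in the allowed market); above it the point estimate turns negative, which is the reversal examined in the split-sample tests that follow.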

To consider the possible significance of this reversal, we ran the regression separately on 
relatively large clients (more than $120 million in total assets) and again on relatively small clients 
(less than $120 million). For the sample of large clients we found that the sign of the direct 
solicitation variable was negative (as expected from our regression and from the discussion 
above), but it was not significant at any reasonable level (p-value of 0.68). In contrast, when we 
ran the regression on relatively small clients, the solicitation coefficient was positive and 
significant at the 0.01 level. We altered the size cutoff with similar results. These results suggest 
that the reversal is not a significant effect; that is, there are not significantly more nonstandard 
reports in the banned market among large clients (no significant difference), whereas there are 
significantly more nonstandard reports in the allowed market among smaller clients. 

While the coefficient of the Big 8/non-Big 8 dummy variable is positive in both samples, it 
is not significant in either. Thus, we conclude that the reporting decisions of Big 8 and non- 
Big 8 auditors do not differ significantly in our samples. This conclusion is tempered by noting 
that there is not a great deal of variation for firms in our samples for this variable. Over 70 percent 
of firms in the B-T sample, and over 90 percent of firms in the DHL sample, had Big 8 auditors. 

Of the other control variables included in the B-T sample, only ST[Total Debt/(PS + Common
Equity)] and ST[NI/(PS + Common Equity)] are significant at conventional levels. Similarly, for
the DHL sample, only the change in the ratio of inventories to total assets and the change in


6 The logit models were identical to those used in tables 6 and 7. Based on a cutoff of $120 million, we labeled
approximately 82 percent of our B-T sample as “relatively small” and 55 percent of our DHL sample (which consisted 
of larger firms overall). 
TABLE 6


Results of Estimating the Logistic Model Used to Test Differences Between Banned
and Allowed Markets


$$ Y_i = a_0 + \sum_{j=1}^{10} a_j X_{ij} + \varepsilon_i $$

where Y_i is the type of audit report*


Bell-Tabor Sample

Variable Name (X_ij)                          Coefficients    Chi-square
Constant                                        -1.866**        37.001
Direct Solicitation (0-banned/1-allowed)         0.755**         5.586
Log of Total Assets (Size)                      -0.021           0.076
Direct Solicitation * Log of Total Assets       -0.270**         7.446
Big 8 (0-non-Big 8/1-Big 8)                      0.257           1.510
RC[NI/(PS + Common Equity)]                     -0.000           0.712
RC(Inventory/Net Sales)                          0.006           0.614
RC(Receivables/Inventory)                        0.001           0.704
RC(CA/CL)                                       -0.2779          2.271
ST[NI/(PS + Common Equity)]                     -0.062**        21.718
ST[Total Debt/(PS + Common Equity)]             -0.103**        20.719

Sample Size: NSR = 183, SR = 787

Model R Statistic        0.230

Model Significance       0.0001

Concordant Pairs         69.2%


** Significant at the 0.01 level.
* The dependent variable (Y) is the type of audit report issued in the current year, where 1 is a first-time nonstandard audit
report and 0 represents a standard audit report. The independent variables are defined above.


leverage variables are significant. The lack of significance of the other variables is not surprising 
since our sample consists of clients already identified as having a probability greater than 0.50 
of receiving a nonstandard audit report. Thus, although the stock market variables included in the 
DHL model are not significant in our logit analysis, they were likely valuable in determining the 
clients’ inclusion in the sample.⁷

The significance of the logistic model is tested by a statistic computed as negative two times
the model log-likelihood ratio, which is distributed as a Chi-square with degrees of freedom equal to
the number of independent variables. As shown in tables 6 and 7, the model R statistics, which are similar
to multiple correlation coefficients in a normal regression, are .23 and .47 for the B-T and DHL
samples respectively. Additionally, tables 6 and 7 report the percentages of concordant pairs
(where a pair consists of one nonstandard report observation and one standard report observation)


7 Sections of both the DHL and B-T papers are devoted to evaluating the predictive accuracy of the estimated models in
terms of the misclassification costs for various type I and type II errors. See those papers for additional details. 

8 The R-statistic is computed as the square root of [(The model Chi-square less 2 times the number of independent
variables)/(-2 times the maximum log-likelihood with only the intercepts in the model)] (SUGI Supplemental Library 
User's Guide, 1983 edition, p. 183). 
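The R-statistic formula in this footnote can be written out as a short sketch. The function name and the input values are hypothetical (the model Chi-squares and intercept-only log-likelihoods are not reported here), and returning zero when the numerator is negative is a convenience assumed for this sketch, not something the footnote specifies.

```python
import math


def r_statistic(model_chi_square: float, n_predictors: int,
                null_log_likelihood: float) -> float:
    """R = sqrt[(model chi-square - 2p) / (-2 * L0)].

    p  = number of independent variables
    L0 = maximized log-likelihood of the intercept-only model (a negative number)
    """
    denominator = -2.0 * null_log_likelihood
    if denominator <= 0:
        raise ValueError("intercept-only log-likelihood must be negative")
    numerator = model_chi_square - 2.0 * n_predictors
    if numerator < 0:
        # Assumption for this sketch: report no explanatory power as zero.
        return 0.0
    return math.sqrt(numerator / denominator)


# Hypothetical inputs, for illustration only.
example_r = r_statistic(model_chi_square=60.0, n_predictors=10,
                        null_log_likelihood=-200.0)
```

The subtraction of 2p penalizes the model Chi-square for the number of estimated coefficients, so adding uninformative variables lowers R rather than raising it.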
TABLE 7


Results of Estimating the Logistic Model Used to Test Differences Between Banned
and Allowed Markets


$$ Y_i = a_0 + \sum_{j=1}^{11} a_j X_{ij} + \varepsilon_i $$

where Y_i is the type of audit report*


Dopuch, Holthausen, and Leftwich Sample

Variable Name (X_ij)                          Coefficients    Chi-square
Constant                                        -7.391**        17.595
Direct Solicitation (0-banned/1-allowed)         3.891**         6.832
Log of Total Assets (Size)                       0.634**         8.293
Direct Solicitation * Log of Total Assets       -0.725**         6.907
Big 8 (0-non-Big 8/1-Big 8)                      0.749           1.214
Time Listed (0-less than 5 yrs/1-otherwise)     -0.210           0.197
Returns - Industry (percent)                    -1.218           1.646
Δ Beta                                           0.333           0.739
Δ Residual Standard Deviation Returns            0.233           1.555
Δ (Receivables/Total Assets)                    -0.400           0.011
Δ (Inventory/Total Assets)                      -7.110*          4.203
Δ (Total Liabilities/Total Assets)               9.774**        20.775

Sample Size: NSR = 55, SR = 133

Model R Statistic        0.467

Model Significance       0.0001

Concordant Pairs         85.2%


** Significant at the 0.01 level, * significant at the 0.05 level.
* The dependent variable (Y) is the type of audit report issued in the current year, where 1 is a first-time nonstandard audit
report and 0 represents a standard audit report. The independent variables are defined above.


for both the B-T and DHL samples.⁹ The percentages of concordant pairs are 69.2% and 85.2%
for the B-T and DHL samples respectively. 

In summary, it appears that auditors in the allowed market are more likely than those in the 
banned market to issue a nonstandard report when such a report is warranted. However, the effect 
of solicitation on the auditors' reporting decisions weakens as the size of the client increases. This 
result is consistent with an argument that solicitation policy has more of an effect on the auditor's 
reporting decision among smaller clients, because their information environment is less sophis- 
ticated on average. In the next section we consider alternative explanations for our findings. 


9 A pair of input observations with different responses is said to be concordant if the larger response has a lower predicted
event probability than the smaller response, where an event response is defined as the response whose ordered value 
is 1. 
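The concordance measure defined in this footnote can be sketched as a direct pairwise comparison. In the sketch, each nonstandard-report observation is paired with each standard-report observation, and the pair counts as concordant when the model assigns the higher predicted probability of a nonstandard report to the observation that actually received one. The function name is hypothetical, and ties are counted in the denominator but not as concordant; the statistical package cited in the paper may group or treat ties differently.

```python
def percent_concordant(event_probs: list[float],
                       nonevent_probs: list[float]) -> float:
    """Percentage of (event, non-event) pairs ranked correctly by the model.

    event_probs:    predicted event probabilities for observations with the event
    nonevent_probs: predicted event probabilities for observations without it
    """
    total_pairs = len(event_probs) * len(nonevent_probs)
    if total_pairs == 0:
        raise ValueError("need at least one observation in each group")
    concordant = sum(
        1
        for p_event in event_probs
        for p_nonevent in nonevent_probs
        if p_event > p_nonevent
    )
    return 100.0 * concordant / total_pairs


# Hypothetical predicted probabilities, for illustration only.
example = percent_concordant([0.8, 0.6], [0.3, 0.7])
```

With the sample sizes reported in tables 6 and 7, the comparison runs over 183 × 787 = 144,021 pairs for the B-T sample and 55 × 133 = 7,315 pairs for the DHL sample.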
Rural and Other Effects


One such alternative explanation is suggested by the argument advanced in the economic 
development section that auditors of small clients are more likely to be earning quasi-rents from 
consulting and hence more tempted to sacrifice independence. If the states which prohibited 
solicitation during the test period were more rural than those which allowed it, then our results 
could be driven by differences in the nature of banned vs. allowed states (urban vs. rural). To 
consider this possibility, we repeated our tests, excluding clients in states identified as "rural" 
based on data obtained from the U.S. Bureau of the Census. We obtained rankings of states by
percentage of population urban, percentages metropolitan vs. non-metropolitan, population per
square mile, and land in rural areas. Based on these rankings, we eliminated the clients in the 20
states with the lowest rankings in percentage urban and percentage metropolitan from our
analysis. For each of these 20 states, less than 65 percent of the area was designated metropolitan
by the Bureau.¹⁰ The results of these analyses were consistent with those presented in tables 6 and
7, providing assurance that the observations from rural states were not unduly influencing the
findings. We also repeated our analysis with all clients included with the addition of a dummy
variable set equal to one for rural states and zero for other states. The coefficient on this dummy
variable was negative for both samples but not significant at any reasonable level. The other 
coefficients in the model were unaffected by the addition of the rural variable. We varied the 
number of states labeled rural and the type of ranking selected, with similar results. 

In the event that our sample selection techniques did not totally control for clustering of audit 
reports by industry or by year, we also performed tests with additional dummy variables for year 
and for industry in our model. The results of such tests were virtually identical to those without 
the additional variables and are not presented here. Finally, we eliminated all the variables from 
the B-T and DHL models from our regressions and still obtained the same levels of significance 
for the solicitation variables and the size-solicitation interaction variables. 

Although we find no evidence that our results are unduly influenced by differences between 
banned and allowed markets in those characteristics considered (such as urban vs. rural or 
industry concentrations), the possibility remains that other differences between banned and 
allowed states might be influencing the results. For example, states which allowed solicitation 
during our test period might be more progressive in their auditing standards and require a higher 
standard of performance than those which banned solicitation. The possibility of some correlated, 
omitted variable should not be ignored. 


V. SUMMARY AND CONCLUSIONS 


This study examines the question of whether direct solicitation rules affect auditors' 
decisions regarding the type of audit report to issue. The evidence suggests that auditors are no 
less likely to issue nonstandard audit reports in markets allowing direct solicitation than in 
markets banning such solicitation. To the contrary, among relatively small clients, results indicate 
that auditors are more likely to issue nonstandard reports in the allowed market in samples where 
the probability of receiving such a report is estimated to be relatively high. 


10 The 20 states labeled as most rural were, in alphabetical order, Alaska, Arkansas, Idaho, Iowa, Kansas, Kentucky,
Maine, Mississippi, Montana, Nebraska, New Hampshire, New Mexico, North Carolina, North Dakota, Oklahoma, 
South Carolina, South Dakota, Vermont, West Virginia, and Wyoming. 
Thus, the results of this research reveal a lack of support for the contention that auditors allow
their independence to be compromised or their audit effort to be reduced due to competitive
pressures in the allowed market. Instead, the results support the views of opponents of bans, who 
suggest that more information is available to clients when direct solicitation is allowed, and that 
this improved information environment should have a positive effect on the audit market. Our 
tests do not distinguish whether the association between the auditors’ reporting decisions and 
solicitation policy is attributable to greater independence or to greater competence in the allowed 
market. Our findings are consistent with economic arguments predicting increases in the levels 
of both independence and competence in the allowed market. Further, the evidence suggests that 
the effects of allowing direct solicitation decline as client size increases.

While solicitation is currently permitted in virtually every state, these findings are of 
importance for at least two reasons. First, although solicitation is almost universally permitted, 
belief in its desirability is far from universal. Second, even if future changes do not occur in 
solicitation policy, these findings have implications regarding the effects of competition on 
independence and competence, an issue which surfaces in many forms. In general, our results 
suggest that a more competitive environment, such as the allowed market, does not induce a 
lowering of auditor independence or competence and may, in fact, lead to improved levels. 
ABSTRACT: We demonstrate that when cost differences among CPA firms serve
as a source of economic rents to the incumbent auditor, the switching costs
previously cited as the source of the auditors' rents may actually reduce the auditors'
economic rents to the benefit of the client. This result has implications for how
switching costs affect the way audit engagements are structured and how clients
invest in their relationships with auditors. While the resulting behavior may appear
to be inefficient or of a suspicious nature, it is a natural consequence of imperfect
competition. This behavior includes (i) clients under-investing in their accounting
systems, (ii) clients accepting their current auditor's management advisory services
(MAS) bid, even though a rival CPA firm has submitted a lower bid for identical MAS,
and (iii) inefficient same sourcing for MAS and audit services when CPA firms treat
their audit and non-audit divisions as separate profit centers.


Key Words: Imperfect competition, Audit markets, Audit pricing, Low-balling,
Auditor switching. 


I. INTRODUCTION 


This paper changes the focus of previous audit pricing models, such as DeAngelo (1981,
1982) and Magee and Tseng (1990), from ex ante perfect competition to imperfect
competition by explicitly modeling cost differences across auditors and the possible
deterioration of an incumbent auditor's competitive advantage. Our model recognizes that clients
change their auditors for a variety of reasons¹ by representing an audit client’s characteristics as
a random variable which changes over time. Auditors do not have identical technologies,² and
consequently, auditor-client matches which are cost minimizing at one point in time may not be 
so later on. We show that when there are alternative sources of economic rents to the auditor, such 
as cost differences, the switching costs previously cited as the source of the auditors’ rents may 
actually reduce the auditors’ economic rents to the benefit of the client by increasing the discounts 
offered by auditors. This result provides a new perspective on the effects of switching costs which 
has strategic implications for the structuring of audit engagements and how clients invest in their 
relationships with auditors. 

The first implication we note is that clients may intentionally reduce their accounting 
expenditures in the period of a switch if these reductions increase the new auditor’s switch costs. 
This apparently inefficient behavior is a natural consequence of imperfect competition, serving 
to reduce the client’s overall audit costs by increasing the competitive advantage of an inefficient 
incumbent auditor who is bidding against the new replacement auditor. The additional compe- 
tition increases the discount that the new auditor must grant in order to obtain the client. 

The next implication provides a new perspective on the debate concerning the purchase of 
audit and management advisory services (MAS) from the same CPA firm.³ With ex ante perfect
competition, MAS prices are determined independently of activity in the audit market, but 
imperfect competition leads to an interdependence in the pricing of these items. As a result, 
whenever the client accepts the current auditor’s MAS bid, she does so even though the competing 
CPA firm has submitted a lower bid for identical MAS. In other contexts, an MAS award of this 
nature might be viewed with suspicion, but these results are derived in a setting where the 
independence of the auditor is not in question. 

The final implication we examine concerns CPA firms that require their divisions to price 
independently. Such “independent pricing policies” have been cited as effective devices for 
encouraging auditor independence when a CPA firm provides both the audit and MAS (see for 
instance Leibman and Kelly, 1992, 436–437). We show that these policies may create an
inefficient demand for same sourcing under imperfect competition. While independent pricing
policies produce certain benefits outside the scope of this paper (e.g., possibly a more independent
audit⁴), and these benefits appear to come without cost in an ex ante perfectly competitive audit
market, the illustration identifies a negative aspect in imperfectly competitive markets. 

By permitting client characteristics to change randomly over time, we explicitly create a 
demand for auditor switches. Consequently, we do not consider other sources of demand for 
auditor switches. These include the cases where (i) neither party knows its relation-specific costs
but learns them over time, and (ii) there exists asymmetric information at the time of the switch.
A fairly extensive literature has developed for case (i) describing settings such as employer-
employee relationships where each party learns the other party's characteristics only after the
relationship has been established (see for example, Mortensen, 1988). These settings are
characterized by a learning period after which a separation takes place whenever the two parties
are sufficiently mismatched. Kanodia and Mukherji (1994) examine a variant of case (ii) to


1 Examples might include the acquisition of new business or product lines, relocation or downsizing.

2 Danos and Eichenseher (1982, 1986), Eichenseher and Danos (1981), Cushing and Loebbecke (1986), and Kinney
(1986) present evidence suggesting that large accounting firms employ different technologies. Johnson and Lys (1990) 
argue that client-auditor realignments represent efficient responses to changes in client operations and activities over 
time. 

3 See Antle and Demski (1991) for a review of this debate. 

4 See Leibman and Kelly (1992, 436–437).
demonstrate that audit switches can endogenously occur as a response to the incumbent auditor's
superior knowledge of the auditing cost. Dye (1991) and Teoh (1992) examine another variant
of case (ii) and demonstrate that a client who knows the auditor to be in error will switch auditors
in the hope of finding a new auditor who is more likely to be correct. Unlike cases (i) and (ii), where
switches become increasingly rare through time, our model incorporates an ever present
possibility of change, which may make it more suited to dynamic analysis over extended periods
of time.

The remainder of the paper contains five sections. Section II presents the pricing model and
Section III the resulting equilibrium. Section IV examines price cuts and low-balling. Section V
illustrates the strategic distortions introduced in an imperfectly competitive audit market by
comparing client firm demand for internal auditing and MAS in perfectly competitive and
imperfectly competitive audit markets. Section VI contains a discussion and summary. Proofs of
all results are contained in the appendix.


II. THE BASIC PRICING MODEL 


The basic pricing model presented in this section is intended to capture salient economic 
features of an audit market with heterogeneous producers. Section V extends the basic model to 
study the impact of this market on the client's demand for internal auditing and the markets for 
other audit-related services. Following Magee and Tseng (1990), the audit market is characterized 
by price competition between two audit firms, designated auditors 1 and 2, for a single client.
Audit firms set prices knowing the characteristics of the client in each of N discrete periods. Upon
changing auditors, the client firm incurs a one-time switching cost, c ≥ 0, representing the start up
costs associated with the new auditor. Similarly, ℓ denotes the switching cost incurred by any new
auditor in the first period of an engagement. Both c and ℓ are required only in the first period of
a relationship and can be viewed as learning costs. However, after a switch from auditor i to
auditor j, a reverse switch (from auditor j to auditor i) is assumed to cost the client c and auditor
i ℓ: learning completely decays after the passage of one period if it is not used in the specific
relationship for which it was developed (i.e., rapid forgetting). The auditor who audits the client
in period t-1 is referred to as the period t incumbent auditor and the auditor who did not audit the
client in period t-1 as the period t rival auditor. Finally, let represent the present value of a dollar
to be received in the next period. 

Our pricing model departs from Magee and Tseng's (1990) model in that, exclusive of
switching costs, our auditors are assumed always to have different incremental auditing costs. Let
the cost borne by auditor i, i = 1, 2, in period t be V_t^i + ℓ if auditor i is the rival auditor in period
t and V_t^i if he is the incumbent auditor in period t, where V_t^i > 0 is the realization of a random
variable described below. V_t^i summarizes the effect of all economically relevant client and
auditor characteristics. At the beginning of each period t, both random auditing costs, V_t^i, i = 1,
2, are publicly realized. This assumption is representative of a setting in which everyone knows
who the best auditor is for a specific client, as might occur when audit firms establish reputations
for expertise in particular industries, geographic locations, or for any other readily observable
characteristic. For exposition, we assume that V_t^i can take one of two possible values H and L,
with H > L.⁵ To focus the analysis on the impact of switching costs on alternative sources of
economic rents, we further assume that, for any period, one of the auditors has cost V_t^i = H and

5 Technically, the cost realizations can be private to each auditor when V_t^i can only take on two values, as is the case in
our analysis. However, this is very special to our model. 
the other has cost V = L, although which auditor has cost H (L) changes randomly over time.⁶
The process generating these random changes is summarized by the transition matrix:


                          Period t+1
                      (L,H)       (H,L)
Period t    (L,H)     1 - λ         λ
            (H,L)       λ         1 - λ


where λ represents the probability that the auditor with cost L (H) in period t changes to H (L) in
period t+1 and the cost realizations are denoted by the ordered pairs (V_t^1, V_t^2). It is assumed that
H - L > ℓ + c, so auditor switches occur in equilibrium if and only if the identity of the auditor
with cost L changes.⁷ Accordingly we will refer to the period t auditor with cost L (H) as the low
(high) cost auditor. The assumption that auditors will not have identical costs in any period and
the implication that switches occur if and only if the identity of the efficient auditor changes are
instrumental in establishing closed-form solutions to our dynamic pricing model. Finally, we
assume that a change of efficient auditor in any given period is not the more likely event, i.e.,
λ < 1/2.
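The cost process and the switching rule described above can be illustrated with a small simulation. This is a sketch under the model's stated assumptions, not part of the paper's analysis: the function names are hypothetical, the numeric values are invented for illustration, and the identity of the low cost auditor is tracked as 1 or 2.

```python
import random


def simulate_low_cost_auditor(n_periods: int, lam: float,
                              start: int = 1, seed=None) -> list[int]:
    """Simulate which auditor (1 or 2) has cost L in each of n_periods periods.

    With probability lam the identity of the low cost auditor changes
    between consecutive periods, as in the two-state transition matrix.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be a probability")
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_periods - 1):
        current = path[-1]
        if rng.random() < lam:
            current = 2 if current == 1 else 1
        path.append(current)
    return path


def switch_pays_in_current_period(high: float, low: float,
                                  ell: float, c: float) -> bool:
    """True when H - L > ell + c, i.e. one period of cost savings covers
    both the new auditor's and the client's switching costs."""
    return high - low > ell + c


def count_switches(path: list[int]) -> int:
    """Number of auditor switches, given that the client switches if and
    only if the identity of the low cost auditor changes."""
    return sum(1 for prev, curr in zip(path, path[1:]) if prev != curr)


# Hypothetical parameter values, for illustration only.
path = simulate_low_cost_auditor(n_periods=20, lam=0.3, seed=0)
n_switches = count_switches(path)
```

Because H - L > ℓ + c is assumed, counting changes in the low cost auditor's identity is the same as counting equilibrium switches; on average a path of N periods contains λ(N - 1) of them.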


III. EQUILIBRIUM 


At the beginning of each period the auditors and client learn the identity of the low cost auditor
by observing V_t^1 and V_t^2. Next auditors 1 and 2 bid for the period t audit.⁸ Let:


pr the period t bid by an incumbent who is the period t low cost auditor 


p#® = the period t bid by a rival who is the period t high cost auditor 
p# = the period t bid by an incumbent who is the period 7 high cost auditor 
pH = the period t bid by a rival who is the period t low cost auditor. 


The auditors' equilibrium bidding strategies and the client' s equilibrium choice of auditor are 
characterized through backward induction. To begin the backward induction, consider first the 
last period, period N, where Lemma 1 defines the equilibrium strategies. 


6 For an analysis of the case where auditors always have the same variable cost see Magee and Tseng (1990). 

7 If $H - L > \ell + c$ and the identity of the efficient auditor changes, the savings in audit costs in the current period alone
are enough to overcome all costs of switching auditors. Therefore, in equilibrium, the client will switch auditors
whenever the identity of the efficient auditor changes. A technical problem arises when $H - L < \ell + c$ in that switches
may still occur when the identity of the low cost auditor changes, even though the cost savings in the current period alone
are not sufficient to justify a switch, provided it is unlikely that the new auditor will lose its cost advantage soon. Then
it may pay for the client to switch immediately and, by accumulating several periods of cost savings, overcome the costs
of switching auditors. This means that when $H - L < \ell + c$, a switch depends on the size of $\lambda$ and the magnitudes of $H$,
$L$, $\ell$ and $c$. We examine the case where $H - L > \ell + c$ for tractability.

8 In equilibrium, the high cost auditor will never win the audit, and consequently he has no incentive to actually submit
a bid. However, the mere presence of the high cost auditor forces the low cost auditor to follow the equilibrium pricing 
strategies we derive, for otherwise, our results imply that the client will request the bid we specify for the high cost 
auditor and the low cost auditor will lose a profitable engagement. 
Lemma 1: The period $N$ equilibrium bids are:

$$p_N^{HR} = H + \ell$$
$$p_N^{LI} = H + \ell + c$$

when the period $N$ low cost auditor is incumbent, and

$$p_N^{HI} = H$$
$$p_N^{LR} = H - c$$


when the period N high cost auditor is incumbent. In either case, the low cost 
auditor wins the engagement. 


Upon switching auditors, the client must pay c. Therefore, a low cost incumbent auditor can 
charge c more than the high cost rival auditor and still retain the client. But if the high cost auditor 
is incumbent, the low cost rival auditor must charge at least c less than the incumbent in order to 
take the client away. Note that it is the high cost producer's cost which determines the equilibrium
prices in Bertrand equilibria with differential costs of production.⁹

Lemma 1 shows that the period $N$ low cost auditor receives fees of $p_N^{LI} = H + \ell + c$ in period $N$ if he is incumbent in that period, but only $p_N^{LR} = H - c$ in fees less a learning cost of $\ell$ if he is the period $N$ rival. This provides an ex post value of incumbency to the auditor who turns out to be the period $N$ low cost auditor of $p_N^{LI} - (p_N^{LR} - \ell) = 2(\ell + c)$. Since there is a $\lambda$ probability that the period $N-1$ high cost auditor will become the period $N$ low cost auditor, his expected value of incumbency is $\lambda 2(\ell + c)$. Similarly, the period $N-1$ low cost auditor has an expected value of incumbency of $(1-\lambda)2(\ell + c)$, since there is a $(1-\lambda)$ probability that he will win the period $N$ audit.
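The value-of-incumbency arithmetic can be checked numerically. The sketch below takes the last-period bids stated in Lemma 1 as given; the dictionary keys (`"LI"`, `"HR"`, `"HI"`, `"LR"`), the function names, and the sample parameter values in the usage note are illustrative assumptions rather than anything taken from the paper.

```python
def last_period_bids(H, ell, c):
    """Period-N bids from Lemma 1 (L/H = low/high cost, I/R = incumbent/rival)."""
    return {"HR": H + ell, "LI": H + ell + c, "HI": H, "LR": H - c}


def ex_post_value_of_incumbency(H, L, ell, c):
    """Profit of the period-N low cost auditor as incumbent minus as rival."""
    bids = last_period_bids(H, ell, c)
    profit_as_incumbent = bids["LI"] - L          # no learning cost
    profit_as_rival = bids["LR"] - L - ell        # must also pay learning cost
    return profit_as_incumbent - profit_as_rival  # equals 2 * (ell + c)


def expected_value_of_incumbency(H, L, ell, c, lam):
    """Expected period-N value of incumbency, seen from period N-1."""
    v = ex_post_value_of_incumbency(H, L, ell, c)
    return {"high_cost_in_N_minus_1": lam * v,
            "low_cost_in_N_minus_1": (1 - lam) * v}
```

For instance, with the hypothetical values H = 100, L = 60, ℓ = 10, c = 5 and λ = 0.25, the ex post value is 30 regardless of L, and the expected values are 7.5 and 22.5.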

As the next step in the backward induction, consider the next-to-last period, $N-1$, at which time the client and both auditors assess the impact of their decisions on the current and last period. Notice from Lemma 1 that if the period $N-1$ auditor becomes the period $N$ low cost auditor, the client will retain that auditor in period $N$ and pay audit fees of $p_N^{LI} = H + \ell + c$ in that period. By contrast, if the period $N-1$ auditor becomes the period $N$ high cost auditor, the client switches auditors and pays $p_N^{LR} = H - c$ in fees along with the switch cost of $c$, for a total period $N$ audit cost of only $H$. Since the equilibrium cost of hiring the rival auditor is $\ell + c$ less than the equilibrium cost of hiring the incumbent, the client saves $\ell + c$ whenever a switch occurs. Consequently, the expected period $N$ costs would actually be lower if the client hired the high cost auditor in period $N-1$, because a switch is more likely in period $N$ if the client hires the high cost auditor in period $N-1$. So, to win the audit engagement in period $N-1$, the low cost auditor must compensate the client for the higher expected period $N$ costs through a price reduction in period $N-1$. Lemma 2 shows that for this reason it is no longer sufficient for the low cost auditor to simply add or subtract $c$ from the rival's period $N-1$ bid to win the audit, as was the case in period $N$.


Lemma 2: The period $N-1$ equilibrium bids are:

$$p_{N-1}^{HR} = H + \ell - \delta\lambda 2(\ell + c)$$
$$p_{N-1}^{LI} = H + (1-\delta)(\ell + c)$$

when the period $N-1$ low cost auditor is incumbent, and

$$p_{N-1}^{HI} = H - \delta\lambda 2(\ell + c)$$
$$p_{N-1}^{LR} = H - \delta(\ell + c) - c$$

when the period $N-1$ high cost auditor is incumbent. In either case, the low cost
auditor wins the engagement.


9 When there are more than two producers it is the second lowest production cost. Since only the most efficient and the
second most efficient producers are relevant to the equilibrium, the other producers are typically ignored.

The high cost rival and the high cost incumbent both bid to break even by submitting bids
equal to their respective costs of conducting the period $N-1$ audit minus their discounted expected
value of incumbency. In response, to make the client indifferent, the low cost incumbent auditor
adds $c$ to the high cost rival's bid and subtracts the discounted expected difference in period $N$
audit costs discussed above.
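The construction just described can be traced step by step. The following sketch builds the period N−1 bids from the break-even and client-indifference conditions and can be compared against the closed forms in Lemma 2. Function and variable names are illustrative, and the client's preference term δ(1 − 2λ)(ℓ + c) is taken from the paper's later derivation rather than re-derived here.

```python
def period_n_minus_1_bids(H, ell, c, lam, delta):
    """Build the period N-1 bids from the two equilibrium conditions."""
    value = 2 * (ell + c)                 # ex post value of incumbency in period N
    # High cost auditors break even: audit cost minus the discounted
    # expected value of incumbency (they become low cost with probability lam).
    hr = (H + ell) - delta * lam * value  # rival also bears the learning cost
    hi = H - delta * lam * value          # incumbent has no learning cost
    # Discounted amount by which the client prefers the high cost auditor,
    # because hiring him makes a money-saving switch more likely next period.
    preference = delta * (1 - 2 * lam) * (ell + c)
    li = hr + c - preference              # low cost incumbent: client avoids switch cost c
    lr = hi - c - preference              # low cost rival: must cover switch cost c
    return {"HR": hr, "LI": li, "HI": hi, "LR": lr}
```

With any admissible parameters, `LI` should equal H + (1 − δ)(ℓ + c) and `LR` should equal H − δ(ℓ + c) − c, matching the statement of the lemma.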

We hypothesize for the purpose of induction that the results of period $N-1$ are also true for
the next N periods going backward, call these periods $t+1$ through $N-1$. (Note that both the
auditors’ and the client’s strategies are part of the inductive hypothesis.) From this hypothesis we
prove that the equilibrium strategies of period $t$ are the same as the hypothesized equilibrium
strategies, thereby completing the proof by induction and establishing that the equilibrium
strategies are identical in periods 2 through $N-1$ (the first period's bids are different because there
is no incumbent in the first period).¹⁰

Given the induction hypothesis, players choose their period $t$ actions to maximize the
discounted expected value of all remaining period payoffs. An important feature of our model is
that only payoffs in periods $t$ and $t+1$ are relevant to the auditors in determining their period $t$
strategies. In the hypothesized equilibrium, the low cost auditor always wins the audit and his
profit is determined by whether or not he won the audit in the previous period. But since bids
cannot affect future cost realizations, any bid made in period $t$ cannot possibly affect the
equilibrium payoffs in periods $t+2$ through $N$.


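Proposition 1: The equilibrium strategies for the two auditors and the client are:

$$p_1^{H} = H + \ell - \delta\lambda 2(\ell + c)$$
$$p_1^{L} = H + \ell - \delta(\ell + c)$$

in the first period,

$$p_t^{HR} = H + \ell - \delta\lambda 2(\ell + c)$$
$$p_t^{LI} = H + (1-\delta)(\ell + c)$$
$$p_t^{HI} = H - \delta\lambda 2(\ell + c)$$
$$p_t^{LR} = H - \delta(\ell + c) - c$$

in periods $t$ with $1 < t < N$,

$$p_N^{HR} = H + \ell$$
$$p_N^{LI} = H + \ell + c$$
$$p_N^{HI} = H$$
$$p_N^{LR} = H - c$$

in the last period, and the client hires the low cost auditor in every period.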


Again the high cost rival and the high cost incumbent both bid to break even by submitting
bids equal to their respective costs of conducting the period $t$ audit minus their discounted
expected value of incumbency.

Only periods $t$ and $t+1$ matter in period $t$, so the expected value of incumbency to the period $t$ high cost auditor is:

$$
\begin{aligned}
&\lambda\left[p_{t+1}^{LI} - (p_{t+1}^{LR} - \ell)\right] \\
&\quad= \lambda\left([H + (1-\delta)(\ell + c)] - [H - (1+\delta)(\ell + c)]\right) \\
&\quad= \lambda 2(\ell + c),
\end{aligned}
$$

just as in period $N-1$. And because the cost of conducting the period $t$ audit is the same as in period
$N-1$, the high cost auditor’s bids are the same in period $t$ as they are in period $N-1$.

10 Although less common, it is acceptable to hypothesize that a theorem holds for the first n cases, rather than just the nth
case, to prove the theorem by induction. See for instance James and James (1976) Mathematics Dictionary.

To make the client indifferent between auditors, and thereby win the engagement, the period
$t$ low cost auditor must consider the client’s desire for switches in the same way as the period
$N-1$ low cost auditor does. The probability of an auditor switch is time independent (i.e., $\lambda$ is
constant) and cost realizations are independent of incumbency per se, so the client's choice of
auditor in any period only affects the probability of a switch in the next period. Therefore, the
client's preference for the high cost auditor in period $t$ is equal to the difference in expected period
$t+1$ audit costs:

$$
\begin{aligned}
&\left[(1-\lambda)p_{t+1}^{LI} + \lambda(p_{t+1}^{LR} + c)\right] - \left[\lambda p_{t+1}^{LI} + (1-\lambda)(p_{t+1}^{LR} + c)\right] \\
&\quad= (1-2\lambda)(p_{t+1}^{LI} - p_{t+1}^{LR} - c) \\
&\quad= (1-2\lambda)(\ell + c),
\end{aligned}
$$

just as in period $N-1$. Finally, the low cost auditor either adds or subtracts $c$, depending on
incumbency, and subtracts the discounted expected difference in future audit costs from the high
cost auditor’s bid to make the client indifferent between auditors. As a result, the low cost
auditor's bids are also the same in period $t$ as they are in period $N-1$.
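The client's preference for the high cost auditor can be verified by computing her expected next-period outlay under each hiring choice. This is a sketch under the interior-period prices of Proposition 1; the function name and the boolean flag are illustrative assumptions.

```python
def expected_next_period_cost(hired_low_cost, H, ell, c, lam, delta):
    """Client's expected period t+1 outlay given whom she hires in period t.

    Uses the interior-period fees: H + (1-delta)(ell+c) if the incumbent is
    retained, and H - delta(ell+c) - c plus the switch cost c otherwise.
    """
    fee_if_retained = H + (1 - delta) * (ell + c)
    cost_if_switch = (H - delta * (ell + c) - c) + c
    # The period-t auditor is retained only if he is low cost in period t+1.
    p_retain = (1 - lam) if hired_low_cost else lam
    return p_retain * fee_if_retained + (1 - p_retain) * cost_if_switch


def preference_for_high_cost(H, ell, c, lam, delta):
    """Expected period t+1 cost after hiring low cost minus after hiring high cost."""
    return (expected_next_period_cost(True, H, ell, c, lam, delta)
            - expected_next_period_cost(False, H, ell, c, lam, delta))
```

The difference should equal (1 − 2λ)(ℓ + c) and should not depend on H or δ, which is what makes the bidding strategies stationary across interior periods.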


IV. PRICE CUTS AND LOW-BALLING 


In this section we show the fundamental differences between the predictions of our model and 
the ex ante perfect competition model of Magee and Tseng (1990) through a comparison of the 
equilibrium audit fees. These differences arise in the discount or “low-ball” an auditor is willing 
to absorb in the first period of an audit relationship in order to gain a new client. While there are 
different interpretations of “price cuts” and “low-balling,” we use the language of Magee and
Tseng who define a price cut as the difference between the second and first period fees charged
in an ongoing relationship and a low-ball as the period cost of an audit less the period price.¹¹

Recall that the auditor with the lowest period $t$ cost will audit the client in period $t$. Therefore,
from Proposition 1, in the first period the auditor charges $H + \ell - \delta(\ell + c)$, and if he is retained
in period two he charges $H + (1-\delta)(\ell + c)$. So the price cut for a previously unaudited client
equals $c$. In any subsequent period, an auditor will charge $H - \delta(\ell + c) - c$ in the period he obtains
the client and $H + (1-\delta)(\ell + c)$ for every period he remains the incumbent. Therefore, the price
cut equals $\ell + 2c$ in the period of an auditor switch.

Magee and Tseng’s results imply a price cut of $c$ whenever the client obtains a new auditor,
and they argue (p. 320) that “the first period price cut observed by Simon and Francis (1988)
should be correlated with the client's costs of switching to a new auditor, not with the auditor's
learning cost.” Note however, that Magee and Tseng's result applies only to the difference
between the fees of the first and second periods, because that is the only time the client firm hires
a new auditor in their model. Our analysis indicates that even when auditors are heterogeneous
and the identity of the efficient auditor is permitted to change, the first period discount still equals
$c$. However, this discount only applies to clients who have not been previously audited. For
auditor switches, the price cut increases from $c$ to $\ell + 2c$.


11 This definition is empirically motivated. DeAngelo (1981) defines a low-ball as the difference between the initial audit’s
cost and fee.
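The two price cuts follow directly from the fee schedule in Proposition 1. A minimal sketch (function names and dictionary keys are illustrative assumptions):

```python
def equilibrium_fees(H, ell, c, delta):
    """Fees charged by the winning (low cost) auditor in interior periods."""
    return {
        "first_period": H + ell - delta * (ell + c),  # no incumbent exists yet
        "incumbent": H + (1 - delta) * (ell + c),     # retained low cost auditor
        "switch": H - delta * (ell + c) - c,          # newly hired after a switch
    }


def price_cuts(H, ell, c, delta):
    """Second-period fee minus initial fee for each kind of new engagement."""
    fees = equilibrium_fees(H, ell, c, delta)
    return {
        "previously_unaudited": fees["incumbent"] - fees["first_period"],
        "auditor_switch": fees["incumbent"] - fees["switch"],
    }
```

The first entry reduces to c and the second to ℓ + 2c for any H and δ, so the gap between the two cuts, ℓ + c, is exactly the incumbent's marginal advantage.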
For a previously unaudited client, neither auditor has the advantage provided by switching
costs, and the only competitive difference is in their incremental costs, $V_t^i$. But for a previously
audited client, the switching costs give the incumbent auditor a marginal competitive advantage
of $\ell + c$ over the rival auditor. A high cost incumbent auditor will bid away this marginal
advantage in a futile attempt to retain the audit. As a result, a low cost auditor's bid is lower by
$\ell + c$ in periods of an auditor switch.

As a note, government audits present an opportunity to test whether price cutting accompanying
auditor switches is greater than discounts associated with a previously unaudited program
since new government programs are not unusual and fee data is available. The possibility that
auditors are being switched in accordance with rotation requirements will complicate the 
analysis, in that the auditee’s characteristics may not have changed in accordance with our model. 
Also, with mandatory rotation, even if the auditee’s characteristics have changed, the ability of 
the auditee to use the current auditor to extract a higher price cut from the auditor may be lost, and 
discounting may more closely resemble that found in a previously unaudited program. 

The auditor extracts rents attributable to his ability to provide lower cost audits to the client 
than his rival (sometimes referred to as “price gouging”). However, the competitive price cut 
counters this effect by turning over the present value of these future quasi-rents to the client in the 
form of an introductory discount. Because the quasi-rents, and therefore the price cut, increase 
with the level of switch costs, we show next that increasing switch costs can strictly decrease the 
expected present value of audit costs to the client when auditors are imperfect competitors ex ante. 
To see this, we calculate the present value of future audit fees as N becomes large (making N large 
simplifies the expressions). First define $r$ as the one period discount rate, which is equal to


$$r = (1-\delta)/\delta.$$


Proposition 2: As $N$ becomes large, the client's expected present value of audit costs
decreases with $\lambda$ and decreases with $c$. Furthermore, the expected present value of audit
costs decreases (increases) with $\ell$ whenever $r < \lambda$ ($r > \lambda$) and does not change with $\ell$
when $r = \lambda$.


Intuitively, the benefit of switch costs to the client in the imperfectly competitive case comes
from reducing the inefficient auditor’s cost disadvantage, making the two auditors more 
competitive. When the incumbent is the high cost auditor, the rival must bid away the marginal 
advantage provided to the incumbent by the switch costs in order to win the engagement. The low 
cost rival auditor is willing to fund this welfare transfer to the client from his competitive 
incremental cost advantage. In contrast, switch costs do not reduce audit fees in the ex ante 
perfectly competitive audit market because the rival auditor is never competitive enough to drive 
the incumbent’s bid down to the point where he gives up the advantage provided by the switch 
costs. 
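The comparative statics in Proposition 2 can be illustrated numerically. The sketch below sums the client's expected discounted outlays over a long but finite horizon using the fees in Proposition 1; the function name, the horizon `N = 200`, and the parameter values in the usage note are illustrative assumptions, not values from the paper.

```python
def expected_pv_audit_cost(H, ell, c, lam, delta, N):
    """Client's expected present value of audit costs over N >= 2 periods.

    A switch occurs with probability lam in every period after the first,
    and the client pays the switch cost c on top of the fee when it does.
    """
    first = H + ell - delta * (ell + c)
    stay = H + (1 - delta) * (ell + c)             # retained incumbent, 1 < t < N
    switch = (H - delta * (ell + c) - c) + c       # new auditor's fee plus switch cost
    interior = (1 - lam) * stay + lam * switch
    last = (1 - lam) * (H + ell + c) + lam * ((H - c) + c)
    pv = first
    for t in range(2, N):
        pv += delta ** (t - 1) * interior
    pv += delta ** (N - 1) * last
    return pv
```

With δ = 0.9 the discount rate is r = (1 − δ)/δ ≈ 0.11, so raising ℓ should lower the present value when λ = 0.25 and raise it when λ = 0.05, while raising λ or c should always lower it.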


V. STRATEGIC EFFECTS 


Client Investment


The internal audit profession has argued that investing in financial reporting system
improvements or internal auditing systems which reduce the external auditors’ costs can be a cost
effective way to execute total audit coverage (e.g., Berry, 1985). This section shows how an
imperfectly competitive audit market can distort this type of investment behavior.

12 To modify Proposition 2 for any finite case, simply replace $r$ in the statement of the proposition with $r/(1 - \delta^{N-1})$, which
approaches $r$ as $N \to \infty$.

Assume that after observing $(V_t^1, V_t^2)$ a client can choose to make a one-time expenditure of
$e$ dollars which reduces each auditor's variable cost by an equal amount. Suppose that $e$ also
reduces the learning cost of the new auditor in the case of an audit switch, as would occur when
expenditures produce better documentation of the system, which in turn reduces learning time
or, in the case of internal auditing, when the expenditures reduce the scope of the external audit
and the learning cost of the new auditor is increasing in the scope of the audit.¹³ These cost
reductions are assumed to persist over the current period, $t$, and $k$ future periods. Let $\ell_{t+m}(e)$
represent the learning cost of the period $t+m$ rival, and $H_{t+m}(e)$ and $L_{t+m}(e)$ represent the
incremental cost for the period $t+m$ high and low cost auditor, respectively. Specifically, let:


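$$\ell_{t+m}(e) = \begin{cases} \ell^0 - f(e) & \text{if } 0 \le m \le k \\ \ell^0 & \text{if } m > k \end{cases} \qquad \text{(1a)}$$

$$H_{t+m}(e) = \begin{cases} H^0 - w(e) & \text{if } 0 \le m \le k \\ H^0 & \text{if } m > k \end{cases} \qquad \text{(1b)}$$

$$L_{t+m}(e) = \begin{cases} L^0 - w(e) & \text{if } 0 \le m \le k \\ L^0 & \text{if } m > k \end{cases} \qquad \text{(1c)}$$


where $f(e)$ and $w(e)$ are increasing and strictly concave, $f(e) < \ell^0$ and $w(e) < L^0$. $L^0$, $H^0$ and $\ell^0$
represent the “base line” costs which would result with no expenditures.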

Switching costs enter prices both through the current period’s costs and through the value of 
incumbency, which is a function of next period’s costs. Restating the equilibrium prices with the 
appropriate period subscripts so that we can determine the effect of each particular period’s 
learning cost on each period’s prices for 1<t<N gives: 


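Lemma 3: The equilibrium bids are:

$$p_t^{HR}(e) = H_t(e) + \ell_t(e) - \delta\lambda 2(\ell_{t+1}(e) + c)$$

$$
\begin{aligned}
p_t^{LI}(e) &= H_t(e) + \ell_t(e) - \delta\lambda 2(\ell_{t+1}(e) + c) + c - \delta(1-2\lambda)(\ell_{t+1}(e) + c) \\
&= H_t(e) + \ell_t(e) + c - \delta(\ell_{t+1}(e) + c)
\end{aligned}
$$

when the low cost auditor is incumbent and

$$p_t^{HI}(e) = H_t(e) - \delta\lambda 2(\ell_{t+1}(e) + c)$$

$$
\begin{aligned}
p_t^{LR}(e) &= H_t(e) - \delta\lambda 2(\ell_{t+1}(e) + c) - c - \delta(1-2\lambda)(\ell_{t+1}(e) + c) \\
&= H_t(e) - c - \delta(\ell_{t+1}(e) + c)
\end{aligned}
$$

when the high cost auditor is incumbent.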


13 It is also possible for an expenditure to decrease $V_t^i$ for each auditor while increasing the learning cost of a new auditor.
In that case, distortions will also occur but in the opposite direction as those indicated below.

The client's objective in choosing her investment in period $t$ is to minimize the expected
present value of total future expenditures. When the low cost auditor is the incumbent and the 
effects of the client's investment last only for one period (k=0), this amounts to minimizing the 
current period's total expenditures: 


Program 1: $\min_e \; p_t^{LI}(e) + e$


which is equivalent to

$$\min_e \; H^0 - w(e) + \ell^0 - f(e) + c - \delta(\ell^0 + c) + e.$$


Let the solution to this program be denoted $e^*$. When the period $t$ low cost auditor is the rival, the
client’s objective is:


Program 2: $\min_e \; p_t^{LR}(e) + e$


which is equivalent to

$$\min_e \; H^0 - w(e) - c - \delta(\ell^0 + c) + e.$$


Denote the solution to Program 2 as e**. For an imperfectly competitive audit market, Program 
2 represents the expenditure choice in a switch period while Program 1 represents the expenditure 
in a non-switch period. Comparing e* and e** to the level of investment that would be chosen in 
the ex ante perfectly competitive audit market gives the following result: 


Proposition 3: When the benefits of the current period's expenditure only last through the 
current period, the client firm will underinvest in external audit substitutes in the 
imperfectly competitive audit market relative to the perfectly competitive audit market, 
but only in periods that the client firm changes auditors. 


In the perfectly competitive setting (where, without loss of generality but for comparability,
assume all auditors have incremental cost $H$), Magee and Tseng (1990) show that
$p_t = H + \ell + c - \delta(\ell + c)$ in all but the first and last periods. As a result, when the period $t$ low cost
auditor is incumbent, the equilibrium audit price in the imperfectly competitive audit market, $p_t^{LI}$,
is identical to the equilibrium price in the perfectly competitive market, $p_t$. So the client's
objectives, and therefore her decisions, are the same whether the audit market is perfectly or
imperfectly competitive. But when the period $t$ rival auditor is the efficient provider, the
equilibrium price, $p_t^{LR}$, is independent of the current period's learning cost, so there is no
incentive for the client to lower that learning cost through her choice of $e$. This results in the
underinvestment in periods of auditor switches.
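A small numerical illustration of Proposition 3 follows. It is only a sketch: the functional forms w(e) = a√e and f(e) = b√e are assumptions chosen because they are increasing and strictly concave, the coefficient values are arbitrary, and the second objective drops the learning-cost term from the fee on the grounds stated above (the low cost rival's fee does not depend on the current learning cost when k = 0).

```python
import math


def w(e, a=4.0):
    """Assumed reduction in each auditor's variable cost (concave in e)."""
    return a * math.sqrt(e)


def f(e, b=2.0):
    """Assumed reduction in the rival's learning cost (concave in e)."""
    return b * math.sqrt(e)


def program_1(e, H0=100.0, ell0=10.0, c=5.0, delta=0.9):
    """Low cost auditor is incumbent: fee depends on current learning cost."""
    return H0 - w(e) + ell0 - f(e) + c - delta * (ell0 + c) + e


def program_2(e, H0=100.0, ell0=10.0, c=5.0, delta=0.9):
    """Low cost auditor is the rival: fee is independent of current learning cost."""
    return H0 - w(e) - c - delta * (ell0 + c) + e


def argmin_on_grid(objective, upper=20.0, steps=20000):
    """Crude grid search; adequate because both objectives are convex in e."""
    grid = [upper * i / steps for i in range(steps + 1)]
    return min(grid, key=objective)
```

Under these assumed forms the first-order conditions give e* = ((a + b)/2)² = 9 for Program 1 and e** = (a/2)² = 4 for Program 2, so the client invests less in a switch period.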

Next suppose that the benefit from the investment lasts more than one period ($k > 0$), and that
$e$ is still a one-time expenditure made in period $t$. Again, the client must consider the effect of $e$
on all affected expected future period fees (periods $t$ through $t+k$). The difference between the
investment problem faced by a client in the ex ante perfectly competitive audit market and a client
in the imperfect audit market occurs only in periods of auditor switches. Recall that when the audit
market is imperfectly competitive there is a $\lambda$ probability of a switch occurring each period in
equilibrium. So while the client sees future audit fees of $p_t$ ($= p_t^{LI}$) each period in the perfectly
competitive market, with imperfect competition the expected audit fee in each of the future
periods is $(1-\lambda)p_t^{LI} + \lambda p_t^{LR}$. Just as $p_t^{LR}$ leads to lower investment in external audit substitutes
than $p_t^{LI}$ when only one period is considered, the expected audit fee, being a convex combination
of $p_t^{LR}$ and $p_t^{LI}$, results in lower investment than $p_t^{LI}$ when the investment lasts for more than
one period. Increasing the frequency of auditor switches moves the expected future audit fees
away from $p_t^{LI}$ and toward $p_t^{LR}$.


Proposition 4: When the benefits of the client's investment last more than one period, the 
client firm will underinvest in external audit substitutes in the imperfect audit market 
relative to the ex ante perfect audit market. 


Proposition 5: The client's expenditure in external audit substitutes is decreasing in the 
probability of an audit switch. 


Since it is the possibility of not having to pay the learning costs to incumbent auditors in the 
form of higher fees that reduces the investment, spending on external audit substitutes decreases 
as the probability of an auditor switch increases. 


Management Advisory Services 


Synergy is frequently cited as the reason for firms hiring the same CPA firm to provide both 
the audit and MAS. Here we show why client firms may hire one CPA firm for the audit and 
another for MAS when the audit market is imperfectly competitive, even though there are 
technological advantages to using the same CPA firm for both services. We also show that when 
the client does hire her auditor to supply MAS, she does so even though the other auditor has 
submitted a lower bid for identical MAS. An MAS award of this nature might be viewed with 
suspicion, but these results are derived in a setting where the independence of the auditor is not 
in question, indicating that this type of pricing behavior does not necessarily imply a lack of 
auditor independence. 

We model the simultaneous operation of an audit and MAS market with the same suppliers 
to each market. In period t, the client can purchase a given quantity of MAS from either the auditor 
she hires in period $t$ or the rival auditor. Let the cost of providing MAS be $I$ for the client's period
$t$ auditor and $R$ for the rival auditor. Synergy is represented by assuming that $I \le R$. Most
importantly, we assume that hiring the rival auditor to supply the period $t$ MAS familiarizes him
with the client's firm and thereby reduces his cost of learning the audit in period $t+1$ by $g$, so that
$\ell_{t+1} = \ell^0 - g$, where $0 < g < \ell^0$. Purchasing MAS from the period $t$ auditor does not reduce his
period $t+1$ learning cost because that auditor will have already become familiar with the client's
firm through the audit engagement (the incumbent auditor has no learning cost in the next period).
Notice that purchasing the audit and MAS from the same CPA firm provides a cost savings of $R - I$,
but eliminates the potential opportunity for a reduction in the next period's learning cost.
Without loss of generality to our results, H and L are assumed to be unaffected by the MAS 
engagement and will be held at a fixed level for all periods. 

Assume that the market for MAS is also characterized by (Bertrand) price competition and 
that neither the audit bids nor the MAS bids may be made contingent on the purchase of both the 
audit and MAS. (That is, firms are not allowed to “bundle” their services.) Then the only effect 
of the period $t$ MAS engagement on the audit market is through $\ell_{t+1}$. Once the MAS engagement
is awarded, the value of $\ell_{t+1}$ is determined and the audit prices follow from Lemma 3.

From Lemma 3, the low cost auditor wins the audit engagement and charges $p_t^{LI}$ or $p_t^{LR}$ for
the audit, depending on whether he is the incumbent or the rival. Furthermore, with probability
$1-\lambda$, he will charge $p_{t+1}^{LI}$ in period $t+1$ as the low cost incumbent. The pricing equations indicate
that same sourcing reduces $p_t^{LI}$ and $p_t^{LR}$ by $\delta g$ while it increases $p_{t+1}^{LI}$ by $g$ and has no effect on
$p_{t+1}^{LR}$. Therefore, considering only the audit market, the period $t$ low cost auditor loses
$\lambda\delta g = \delta g - (1-\lambda)\delta g$ in discounted expected audit fees as a consequence of winning the MAS
engagement. As a result, the lowest profitable MAS bid the low cost auditor can make is $I + \lambda\delta g$.

The period $t$ high cost auditor will win the period $t+1$ audit with probability $\lambda$, in which case
supplying the period $t$ MAS reduces his audit learning costs by $g$ in period $t+1$ but does not affect
his audit fees ($p_{t+1}^{LR}$ is not affected by which firm has supplied the MAS). Consequently,
considering only the audit market, the period $t$ high cost auditor gains $\lambda\delta g$ in expected audit
market profit from winning the MAS engagement. As a result, the lowest profitable MAS bid the
high cost auditor can make is $R - \lambda\delta g$.

The client is assumed to minimize the total expected discounted costs incurred in both 
markets. Noting that the $\lambda\delta g$ loss of expected audit fees to the low cost auditor represents a transfer
of the same amount to the client, the client's discounted expected audit costs are $\lambda\delta g$ lower if she
hires the low cost auditor. This means that the high cost auditor must bid $\lambda\delta g$ less on the MAS
job than the low cost auditor in order to win it. It is this feature of our model that causes the client
to occasionally accept an MAS bid even though the competing CPA firm has submitted a lower 
one. 
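The MAS bidding logic in the preceding paragraphs can be sketched as follows. The function names and the labels `"auditor"` (the period t low cost auditor, who performs the audit) and `"rival"` (the period t high cost auditor) are illustrative assumptions; the break-even bids and the client's comparison are those described in the text.

```python
def mas_break_even_bids(I, R, lam, delta, g):
    """Lowest profitable MAS bids once audit-market spillovers are counted."""
    spillover = lam * delta * g
    return {
        "auditor": I + spillover,  # loses lam*delta*g in expected audit fees
        "rival": R - spillover,    # gains lam*delta*g in expected audit profit
    }


def mas_winner(I, R, lam, delta, g):
    """Who can win the MAS job when both firms are pushed to break-even.

    The client's expected audit costs are lam*delta*g lower when her auditor
    supplies MAS, so she adds that amount to the rival's bid when comparing.
    """
    spillover = lam * delta * g
    bids = mas_break_even_bids(I, R, lam, delta, g)
    rival_total_cost = bids["rival"] + spillover
    auditor_total_cost = bids["auditor"]
    return "rival" if rival_total_cost < auditor_total_cost else "auditor"
```

The comparison reduces to whether R − I is below λδg: a small synergy relative to the learning spillover sends the MAS engagement to the rival, and otherwise the auditor keeps it even if the rival's quoted bid is lower.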

Under ex ante perfect competition, the incumbent auditor and his rival will both bid R for the 
period t MAS engagement with the incumbent winning the engagement whenever /<R. The audit 
fee is the same regardless of who supplies the MAS. Under imperfect competition in the audit 
market, as noted above, the period $t$ audit fee is $\delta g$ less when same sourcing is used than when the
current auditor's rival provides MAS, with the MAS market having the following characteristics:


Proposition 6: Assume imperfect competition in the audit market. 


(i) When $R - I > \lambda\delta g$, the period $t$ high cost auditor will bid $R - \lambda\delta g$ for the MAS
engagement, the period $t$ low cost auditor will bid $R$ for the MAS engagement, and the
low cost auditor will win the MAS engagement.


(ii) When $R - I < \lambda\delta g$, the period $t$ high cost auditor will bid $I + \lambda\delta g$ for the MAS
engagement, the period $t$ low cost auditor will bid $I$ for the MAS engagement, and the
high cost auditor will win the MAS engagement.


Proposition 6 implies that MAS is efficiently sourced when there is imperfect competition 
in the audit market because MAS is always purchased from the provider with the lower total 
expected cost (MAS cost plus the effect of MAS on audit cost). The most interesting implication
of Proposition 6 is that the same sourcing CPA firm obtains a premium of $\lambda\delta g$ over what its
competitor charges on the MAS engagements that it wins. The client is willing to pay this because 
if the other CPA firm were to supply MAS, the client’s current auditor will receive a higher audit 
fee. In the absence of other evidence, such premia might call into question the independence of 
the auditor’s relationship with his client. This paper suggests however, that in imperfectly 
competitive audit markets, a premium paid by a client for MAS work done by her auditor may 
be restitution for the additional audit market competition created by same sourcing. A necessary 
condition for any of these interesting pricing characteristics, however, is that there be some form 
of ex ante imperfect competition. 

It is a commonly held notion¹⁴ that auditing has become a relatively competitive, unprofitable,
and mature industry, and that instead of going out of the auditing business, many CPA firms
use the audit as a price leader to market their (assumed) more profitable MAS services. These
results suggest an alternative explanation for audit fees which are apparently lower when the
client purchases MAS from her auditor.

14 See, for example, Leibman and Kelly (1992, 416-417) for this precise argument and other references.


Independent Pricing Policies 


This section examines the effects of independent pricing policies whereby a CPA firm 
instructs each of its MAS and audit divisions to ignore the impact of their activities on the 
profitability of the other division. We continue to assume however, that the independent pricing
policy does not prevent technological spillovers between the divisions so that as before, $I \le R$. This
is consistent with a CPA firm that treats both its audit and MAS division as profit centers but 
encourages the free flow of information between the divisions. We assume that both CPA firms 
have independent pricing policies in effect. 

It was shown above that in the absence of independent pricing policies, the CPA firm not 
conducting the audit would reduce its MAS bid by the difference in expected audit costs in an attempt
to win the MAS engagement, financing the lower bid through the expected reduction in next
period's audit costs. However, with an independent pricing policy, the MAS divisions are told to
ignore the impact of the MAS engagement on the audit division, so the rival CPA firm will never
lower its bid below $R$. Since the incumbent auditor's MAS division has a lower cost of conducting
the MAS, he can always bid to win the MAS engagement, leading to our final result: 


Proposition 7: Suppose that CPA firms use independent pricing policies and that an 
incumbent auditor has a cost advantage in supplying MAS. Then a client in an 
imperfectly competitive audit market always hires the same firm for MAS and audit. 


Note that because same sourcing is used when $R - I < \lambda\delta g$, the client no longer efficiently
sources her MAS. This inefficiency occurs because the independent pricing policy prevents the
CPA firm conducting the audit from making the client bear its full costs through the MAS fees
and it also prevents the other CPA firm from lowering its bid in accordance with its expected audit
cost savings. Consequently, in the imperfectly competitive audit market, the independent pricing
policy can lead to underpricing in the MAS market by the auditor and overpricing in the MAS
market by the other CPA firm, which in turn can lead to the inefficient sourcing of MAS. In
contrast, for the perfectly competitive audit market, the independent pricing policy is inconsequential
since the MAS engagement does not affect the audit costs (i.e., switching never occurs).


VI. DISCUSSION AND SUMMARY 


This paper has demonstrated that when cost differences among CPA firms serve as a source 
of economic rents to the incumbent auditor, the switching costs previously cited as the source of 
the auditors' rents may actually serve to transfer rents from the auditor back to the client. 
Obtaining this result required certain simplifying assumptions. In particular, it was assumed that 
at any point in time (a) the number of CPA firms equals two, (b) one of the CPA firms has a strict 
competitive cost advantage in auditing the client where the differential for the current period 
exceeds the combined switch costs of client and competing CPA firm, (c) the identity of the CPA 
firm with the cost advantage changes randomly, and (d) the market knows the nature of the cost 
advantage before bids are made and accepted. 

An obvious drawback of this model is that it does not permit an explicit examination of other 
important issues such as auditor independence or the role of asymmetric information in audit 
markets.* A distinct advantage of the model, however, is that in addition to deriving the main
result above, it also permitted us to derive implications of this result concerning how switching
costs affect the structure of audit engagements and how clients invest in their relationships with
auditors and in audit-related services. While the resulting client behavior might appear inefficient
or suspicious to outsiders, it is by definition the natural consequence of imperfect competition.
This behavior includes client under-investment in accounting systems, clients accepting their
current auditor's MAS bid even though a competitor has submitted a lower bid for identical MAS,
and inefficient same-sourcing for MAS and audit services for CPA firms who treat their audit and
non-audit divisions as separate profit centers.

* Note, however, that Johnson and Lys (1990) argue that they find no evidence to suggest that, on average, opinion
shopping is an important determinant of auditor switches.


APPENDIX 


Proof of Lemma 1: Price competition when producers have different costs results in equilibrium
bids with the following two properties in each period:

(1) at least one auditor's bid will be driven down to the point where he is indifferent between
winning and losing the current period's audit and

(2) the other auditor will win the engagement with a bid that (in the limit) makes the client
indifferent between auditors.

(See, for example, Tirole, 1988; and Magee and Tseng, 1990.) First consider the case where the
period N low cost auditor is incumbent. Property (1) implies either (i) $p_N^{HR} = H + \ell$ or
(ii) $p_N^{LI} = L$. Property (2) and (i) imply $p_N^{LI} = H + \ell + c$, while (2) and (ii) imply
$p_N^{HR} = L - c$. But since $p_N^{HR} = L - c$ is a negative expected profit bid it must be that:

$$p_N^{HR} = H + \ell, \qquad p_N^{LI} = H + \ell + c$$

and the client hires the low cost auditor.

When the period N low cost auditor is the rival, property (1) implies either (i') $p_N^{HI} = H$ or
(ii') $p_N^{LR} = L + \ell$. Property (2) and (i') imply $p_N^{LR} = H - c$, while (2) and (ii') imply
$p_N^{HI} = L + \ell + c$. But $H - L > \ell + c$ implies $p_N^{HI} = L + \ell + c$ is a negative
expected profit bid. Therefore, it must be that:

$$p_N^{HI} = H, \qquad p_N^{LR} = H - c$$

and the client hires the low cost auditor. //
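The final-period bids in the proof of Lemma 1 can be checked numerically. The following is a minimal sketch, not part of the original proof: the function and key names are hypothetical, `ell` stands for the competing CPA firm's switching cost, `c` for the client's switching cost, and the parameter values are arbitrary illustrations chosen to satisfy H - L > ell + c.

```python
def last_period_bids(H, L, ell, c):
    """Final-period equilibrium bids as stated in the proof of Lemma 1.

    H, L : audit costs of the high and low cost auditor
    ell  : switching cost of the competing (non-incumbent) CPA firm
    c    : client's switching cost
    """
    if H - L <= ell + c:
        raise ValueError("the proof assumes H - L > ell + c")
    return {
        # Low cost auditor is incumbent: the high cost rival bids its break-even price.
        "low_incumbent": {"rival": H + ell, "incumbent": H + ell + c},
        # High cost auditor is incumbent: it bids its cost and the low cost rival undercuts by c.
        "high_incumbent": {"incumbent": H, "rival": H - c},
    }


def client_is_indifferent(case, c):
    """Staying costs the incumbent's bid; switching costs the rival's bid plus c."""
    return case["incumbent"] == case["rival"] + c


# Illustrative (assumed) parameter values.
bids = last_period_bids(H=100, L=60, ell=10, c=5)
for name, case in bids.items():
    print(name, case, client_is_indifferent(case, c=5))
```

In both cases the client is indifferent between the two bids and the winning low cost auditor earns a positive margin, which is what properties (1) and (2) require.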


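Proof of Lemma 2: First consider the case where the period N-1 low cost auditor is incumbent.
Again, using the properties of equilibrium listed above and noting each auditor's expected value
of incumbency, (1) implies either
(i) $p_{N-1}^{HR} = H + \ell - \delta\lambda\,2(\ell + c)$ or (ii) $p_{N-1}^{LI} = L - \delta(1-\lambda)\,2(\ell + c)$.
Property (2) and (i) imply:

$$H + \ell - \delta\lambda\,2(\ell + c) + c + \delta\left[\lambda p_N^{LI} + (1-\lambda)(p_N^{LR} + c)\right]
= p_{N-1}^{LI} + \delta\left[(1-\lambda) p_N^{LI} + \lambda(p_N^{LR} + c)\right]$$
$$\Leftrightarrow \; p_{N-1}^{LI} = H + (1-\delta)(\ell + c).$$

Property (2) and (ii) imply:

$$L - \delta(1-\lambda)\,2(\ell + c) + \delta\left[(1-\lambda) p_N^{LI} + \lambda(p_N^{LR} + c)\right]
= p_{N-1}^{HR} + c + \delta\left[\lambda p_N^{LI} + (1-\lambda)(p_N^{LR} + c)\right]$$
$$\Leftrightarrow \; p_{N-1}^{HR} = L - c - \delta(\ell + c).$$

But $p_{N-1}^{HR} = L - c - \delta(\ell + c)$ is a negative expected profit bid, so it must be that:

$$p_{N-1}^{HR} = H + \ell - \delta\lambda\,2(\ell + c), \qquad p_{N-1}^{LI} = H + (1-\delta)(\ell + c).$$

When the period N-1 low cost auditor is the rival, (1) implies either
(i') $p_{N-1}^{HI} = H - \delta\lambda\,2(\ell + c)$ or (ii') $p_{N-1}^{LR} = L + \ell - \delta(1-\lambda)\,2(\ell + c)$.
Property (2) and (i') imply:

$$H - \delta\lambda\,2(\ell + c) + \delta\left[\lambda p_N^{LI} + (1-\lambda)(p_N^{LR} + c)\right]
= p_{N-1}^{LR} + c + \delta\left[(1-\lambda) p_N^{LI} + \lambda(p_N^{LR} + c)\right]$$
$$\Leftrightarrow \; p_{N-1}^{LR} = H - c - \delta(\ell + c).$$

Property (2) and (ii') imply:

$$L + \ell - \delta(1-\lambda)\,2(\ell + c) + c + \delta\left[(1-\lambda) p_N^{LI} + \lambda(p_N^{LR} + c)\right]
= p_{N-1}^{HI} + \delta\left[\lambda p_N^{LI} + (1-\lambda)(p_N^{LR} + c)\right]$$
$$\Leftrightarrow \; p_{N-1}^{HI} = L + \ell + c - \delta(\ell + c).$$

But $p_{N-1}^{HI} = L + \ell + c - \delta(\ell + c)$ is a negative expected profit bid, so it must be that:

$$p_{N-1}^{HI} = H - \delta\lambda\,2(\ell + c), \qquad p_{N-1}^{LR} = H - \delta(\ell + c) - c. \; //$$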
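The client-indifference conditions used in the proof of Lemma 2 can be verified numerically. The sketch below is illustrative only: the function names, dictionary keys, and parameter values are assumptions, with `delta` standing for the discount factor, `lam` for the probability that the cost advantage changes hands, `ell` for the competing CPA firm's switching cost, and `c` for the client's switching cost. It takes the final-period prices from Lemma 1 as given (the client's outlay is H + ell + c under a low cost incumbent and (H - c) + c = H under a high cost incumbent).

```python
import math


def period_n_minus_1_bids(H, ell, c, delta, lam):
    """Period N-1 equilibrium bids as stated at the end of each case of Lemma 2."""
    gap = 2 * (ell + c)  # value of incumbency to a low cost auditor
    return {
        "HR": H + ell - delta * lam * gap,   # high cost rival
        "LI": H + (1 - delta) * (ell + c),   # low cost incumbent
        "HI": H - delta * lam * gap,         # high cost incumbent
        "LR": H - delta * (ell + c) - c,     # low cost rival
    }


def client_costs(H, ell, c, delta, lam):
    """Return (cost of staying, cost of switching) for each incumbency case."""
    p = period_n_minus_1_bids(H, ell, c, delta, lam)
    outlay_low_inc = H + ell + c   # period N outlay when the incumbent is low cost
    outlay_high_inc = H            # period N outlay when the incumbent is high cost
    # Expected period N outlay when the auditor hired today is low cost / high cost.
    after_low = (1 - lam) * outlay_low_inc + lam * outlay_high_inc
    after_high = lam * outlay_low_inc + (1 - lam) * outlay_high_inc
    return {
        "low_incumbent": (p["LI"] + delta * after_low, p["HR"] + c + delta * after_high),
        "high_incumbent": (p["HI"] + delta * after_high, p["LR"] + c + delta * after_low),
    }


# Illustrative (assumed) parameter values.
costs = client_costs(H=100, ell=10, c=5, delta=0.9, lam=0.2)
for case, (stay, switch) in costs.items():
    print(case, round(stay, 6), round(switch, 6), math.isclose(stay, switch))
```

In each case the two totals coincide, so the client is indifferent between the incumbent and the rival at the stated bids.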


Proof of Proposition 1: We take the results of Lemma 2 as the first step in a formal proof by
induction and hypothesize for the purposes of our proof that the period N-1 equilibrium strategies
apply for all periods n such that t < n < N. We now show that, given our hypothesis, these same
strategies are equilibrium strategies in period t. First consider the case of a low cost incumbent
competing with the high cost rival in period t. Define $\Pi_t^{LIW}$ as the discounted expected
equilibrium profits starting from period t for a Low cost Incumbent auditor who Wins the period
t audit. Then:

$$\Pi_t^{LIW} = p_t^{LI} - L + \delta\left[(1-\lambda)\Pi_{t+1}^{LIW} + \lambda\Pi_{t+1}^{HIL}\right].$$

$p_t^{LI} - L$ is the profit earned in period t by a low cost incumbent who wins the audit with a bid of
$p_t^{LI}$, and $\delta[(1-\lambda)\Pi_{t+1}^{LIW} + \lambda\Pi_{t+1}^{HIL}]$ is the discounted expected continuation value starting in period
t+1 for a low cost incumbent who wins the audit in period t, because $(1-\lambda)$ is the probability of
remaining the low cost auditor and winning the audit in period t+1 while $\lambda$ is the probability of
becoming high cost and losing the audit. A low cost incumbent auditor who loses the period t audit
has discounted expected equilibrium profits starting from period t of:

$$\Pi_t^{LIL} = 0 + \delta\left[(1-\lambda)\Pi_{t+1}^{LRW} + \lambda\Pi_{t+1}^{HRL}\right].$$


332 The Accounting Review, April 1995 


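Therefore, the bid by a low cost incumbent auditor which satisfies property (1) is defined by
$\Pi_t^{LIW} - \Pi_t^{LIL} = 0$:

$$p_t^{LI} - L + \delta\left[(1-\lambda)\Pi_{t+1}^{LIW} + \lambda\Pi_{t+1}^{HIL}\right] - \delta\left[(1-\lambda)\Pi_{t+1}^{LRW} + \lambda\Pi_{t+1}^{HRL}\right] = 0$$
$$\Leftrightarrow \; p_t^{LI} = L - \delta\left[(1-\lambda)(\Pi_{t+1}^{LIW} - \Pi_{t+1}^{LRW}) + \lambda(\Pi_{t+1}^{HIL} - \Pi_{t+1}^{HRL})\right].$$

Solving for the expected profit terms and their differences on the right hand side of this expression
gives:

$$\Pi_{t+1}^{LIW} = p_{t+1}^{LI} - L + \delta\left[(1-\lambda)\Pi_{t+2}^{LIW} + \lambda\Pi_{t+2}^{HIL}\right]$$
$$\Pi_{t+1}^{LRW} = p_{t+1}^{LR} - L - \ell + \delta\left[(1-\lambda)\Pi_{t+2}^{LIW} + \lambda\Pi_{t+2}^{HIL}\right]$$
$$\Pi_{t+1}^{HIL} = 0 + \delta\left[(1-\lambda)\Pi_{t+2}^{HRL} + \lambda\Pi_{t+2}^{LRW}\right]$$
$$\Pi_{t+1}^{HRL} = 0 + \delta\left[(1-\lambda)\Pi_{t+2}^{HRL} + \lambda\Pi_{t+2}^{LRW}\right]$$

and therefore,

$$\Pi_{t+1}^{LIW} - \Pi_{t+1}^{LRW} = p_{t+1}^{LI} - p_{t+1}^{LR} + \ell = 2(\ell + c)$$
$$\Pi_{t+1}^{HIL} - \Pi_{t+1}^{HRL} = 0.$$

Substituting back into the expression for the minimum $p_t^{LI}$ gives:

$$p_t^{LI} = L - \delta(1-\lambda)\,2(\ell + c).$$

Similarly, the high cost rival has a discounted expected payoff if he wins the period t audit of:

$$\Pi_t^{HRW} = p_t^{HR} - H - \ell + \delta\left[(1-\lambda)\Pi_{t+1}^{HIL} + \lambda\Pi_{t+1}^{LIW}\right]$$

and if he loses,

$$\Pi_t^{HRL} = 0 + \delta\left[(1-\lambda)\Pi_{t+1}^{HRL} + \lambda\Pi_{t+1}^{LRW}\right].$$

So the bid by the high cost rival which satisfies property (1) is defined by:

$$p_t^{HR} - H - \ell + \delta\left[(1-\lambda)\Pi_{t+1}^{HIL} + \lambda\Pi_{t+1}^{LIW}\right] - \delta\left[(1-\lambda)\Pi_{t+1}^{HRL} + \lambda\Pi_{t+1}^{LRW}\right] = 0$$
$$\Leftrightarrow \; p_t^{HR} = H + \ell - \delta\left[(1-\lambda)(\Pi_{t+1}^{HIL} - \Pi_{t+1}^{HRL}) + \lambda(\Pi_{t+1}^{LIW} - \Pi_{t+1}^{LRW})\right].$$

Substituting for the period t+1 continuation values (solved above) gives:

$$p_t^{HR} = H + \ell - \delta\lambda\,2(\ell + c).$$

These indifference bids are exactly the same as those in the proof of Lemma 2. Therefore the same
proof applies here, giving the period t equilibrium strategies of:

$$p_t^{HR} = H + \ell - \delta\lambda\,2(\ell + c), \qquad p_t^{LI} = H + (1-\delta)(\ell + c)$$

when the low cost auditor is incumbent, and the client hires the low cost auditor.
The case where the high cost auditor is incumbent follows in the same way, giving period t
equilibrium strategies of:

$$p_t^{HI} = H - \delta\lambda\,2(\ell + c), \qquad p_t^{LR} = H - \delta(\ell + c) - c$$

and the client hires the low cost auditor. //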
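Proof of Proposition 2: The present value of expected total audit costs starting in period 1 is:

$$p_1 + \sum_{i=1}^{N} \delta^i \left((1-\lambda)\,p_i^{LI} + \lambda\,(p_i^{LR} + c)\right)$$

which is equivalent to:

$$H + \ell - \delta(\ell + c) + \sum_{i=1}^{N-1} \delta^i \left(H + (1-\delta)(\ell + c) - \lambda(\ell + c)\right) + \delta^N \left((1-\lambda)(H + \ell + c) + \lambda H\right)$$
$$= H + (1-\delta)(\ell + c) - c + \sum_{i=1}^{N} \delta^i \left(H + (1-\delta)(\ell + c) - \lambda(\ell + c)\right) + \delta^{N+1}(\ell + c).$$

As $N \to \infty$ this approaches:

$$H + (1-\delta)(\ell + c) - c + (1/r)\left(H + (1-\delta)(\ell + c) - \lambda(\ell + c)\right)$$
$$= H/(1-\delta) + \ell - (\lambda/r)(\ell + c).$$

By inspection, this expression is decreasing in $c$ and $\lambda$ and decreasing in $\ell$ if and only if $\lambda/r > 1$. //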
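The limit in the proof of Proposition 2 can be checked against the finite-horizon sum. The sketch below is illustrative only and rests on stated assumptions: the function names and parameter values are hypothetical, the first-period price is taken to be H + ell - delta*(ell + c) (the value implied by the second line of the proof's expansion), and r is linked to the discount factor by delta = 1/(1 + r), which is what makes the sum of delta^i over i >= 1 equal 1/r.

```python
def pv_audit_costs(H, ell, c, delta, lam, N):
    """Finite-horizon present value of expected audit costs (period 1 plus N later periods)."""
    first = H + ell - delta * (ell + c)                     # assumed first-period price
    per_period = H + (1 - delta) * (ell + c) - lam * (ell + c)
    last = (1 - lam) * (H + ell + c) + lam * H              # final-period expected outlay
    middle = sum(delta ** i * per_period for i in range(1, N))
    return first + middle + delta ** N * last


def pv_limit(H, ell, c, delta, lam):
    """Closed-form limit as N grows, with r implied by delta = 1 / (1 + r)."""
    r = (1 - delta) / delta
    return H / (1 - delta) + ell - (lam / r) * (ell + c)


# Illustrative (assumed) parameter values.
params = dict(H=100, ell=10, c=5, delta=0.9, lam=0.2)
print(pv_audit_costs(N=2000, **params), pv_limit(**params))
```

With these values lam/r = 1.8 > 1, so raising `ell` lowers the limit; rerunning with a smaller `lam` (for example 0.05, where lam/r < 1) reverses the sign, in line with the comparative static stated in the proof.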


Proof of Lemma 3: The proof follows the proof of Proposition 1 identically, but with period
subscripts. //


Proof of Proposition 3: From the necessary and sufficient first order conditions, $w'(e^{*}) + f'(e^{*}) = 1$
and $w'(e^{**}) = 1$. Since $f'(e) > 0$, $w'(e^{**}) > w'(e^{*})$. And finally $w''(e) < 0$ implies $e^{*} > e^{**}$. //


Proof of Proposition 4: The programming problems for the optimal choice of e in the case of 
perfect competition, imperfect competition with a low cost incumbent and imperfect competition 


with a high cost incumbent are, respectively: 
Program 3 (ex ante perfect competition) 


min e+ ) pH (e). 
e md 
Program 4 (imperfect competition, low cost incumbent) 

min e+ pH (e)+ Y [(1— A)pll(e) + Miele) c). 
Program 5 (imperfect competition, high cost incumbent) 

min es pit(e)+o+ X9 - ple) Mile) o). 


Call the respective solutions to these programs e;, ej and e; . Substituting from Lemma 3 for 
the appropriate prices and deriving the first order conditions results in the following expressions:


k k k-1 
fet Mo w'(e;) - 1, f'(e )- M ów'(er) 2 1-AY Sf (ef) 
in in imO 


k k-1 
and 2, ó'w'(et")- Ao, à f'(er*)-1 
when the second order conditions are satisfied. (The second order condition is satisfied in the
perfect audit market, but may not be in the imperfect audit market. Those cases will be addressed 


below.) Therefore: k l 
f'(ep)* M w'(ter) > f'teg) + M Swe) 
i-0 im 


k k 
and 2, 0 (ej oc 2 Owen 


1-0 
k 
Since fe) and Ý ó'w(e) are concave increasing in e, e; > ef" and e; > ep. 


In 
Since w(e)—L* and fle) > f° as e—»oe, total audit expenditures go to infinity as e gets arbitrarily 
large. And since total expenditures are finite when e=0, it must be that e;,' = 0 and e," = 0 when 
their respective second order conditions for a minimum are nowhere satisfied. // 


Proof of Proposition 5: From above, er =0 ande," =0 if their respective second order 
conditions are nowhere satisfied. Since the second order conditions are decreasing in A, if a second 
order condition is nowhere satisfied for some À", then it is nowhere satisfied for any A24". So, if 
e, — 0, orrespectively ej" — 0, at some value of A, it must also be zero for all higher values 
of A. 

When its second order condition is satisfied, e," may be defined as an implicit function of J, 
e; (A). Then differentiating the second order condition with respect to / gives: 





k—1 
der'(À) ] 2, OF (e, (A)) 
k k-1 e 
dÀ E Siwe (a) f" (e (1)- A 8 f" (er (À)) 
im) imG 


The numerator of the right hand side is positive, while the denominator is negative whenever the 
second order condition is satisfied. Therefore e,'(A) is decreasing in A. 

Likewise, when its second order condition is satisfied, e;" may be defined as an implicit 
function of A, e;" (A). Then differentiating the second order condition with respect to A gives: 


k-1 


dA k k-i z 
2,8 w"(er" (A)) - A M 0! f"(ei" (A)) 
i=0 i=l 





And again, the numerator of the right hand side is positive, while the denominator is negative 
whenever the second order condition is satisfied. Therefore ef” ( 4) is decreasing in A, completing 
the proof. // 


Proof of Proposition 6: The difference between the expected audit profits to a low cost auditor
who wins the MAS engagement and a low cost auditor who loses the MAS engagement is $-\lambda\delta g$.
Therefore, a low cost auditor's lowest possible bid for MAS is $I + \lambda\delta g$. Similarly, the difference
between the expected audit profits to a high cost auditor who wins the MAS engagement and a
high cost auditor who loses the MAS engagement is $\lambda\delta g$. So the lowest possible MAS bid for a
high cost auditor is $R - \lambda\delta g$. The difference in discounted expected audit costs for a client who hires
the auditor to provide MAS and one who hires the other CPA firm for MAS is $-\lambda\delta g$. Therefore,
if the other CPA firm is to win the MAS engagement, he must bid $\lambda\delta g$ less than the auditor bids
for MAS. Consequently, the rival CPA firm can underbid the auditor for MAS if and only if:

$$R - \lambda\delta g + \lambda\delta g < I + \lambda\delta g \;\Leftrightarrow\; R - I < \lambda\delta g.$$

When the high cost auditor can underbid the low cost auditor for the MAS, the low cost auditor
bids $I + \lambda\delta g$ and, to make the client indifferent, the high cost auditor bids $I + \lambda\delta g - \lambda\delta g = I$. When the
high cost auditor cannot underbid the low cost auditor on the MAS, he must bid $R - \lambda\delta g$, and to
make the client indifferent, the low cost auditor bids $R - \lambda\delta g + \lambda\delta g = R$ for the MAS. //


Proof of Proposition 7: As before, the difference in discounted expected audit costs for a client
who hires the auditor to provide MAS and one who hires the other CPA firm for MAS is $-\lambda\delta g$.
Therefore, if the other CPA firm is to win the MAS engagement, he must bid $\lambda\delta g$ less than the
auditor bids for MAS. But for the MAS market in isolation, the lowest possible bid for the other
CPA firm is R. This means the incumbent auditor's CPA firm can bid $R + \lambda\delta g$ and win the MAS
engagement. Since that firm's MAS division only considers its own division profit, it will gladly
make this bid because $R + \lambda\delta g > I$. //
ABSTRACT: Although costs of default underpin the debt covenant hypothesis, prior
research provides limited evidence of their nature, magnitude, and impact on
shareholder wealth. We show that announcements of technical default are associated
with significant stock price declines. Combining post-default changes in terms
of debt contracts with stock returns, we examine whether the consequences arising
from renegotiation of lending agreements are priced in the market, and estimate that
higher costs of borrowing and new restrictions on firms' opportunities impose wealth
losses of 1.4% on shareholders. Leverage measures, frequently used in accounting
research as proxies for economic effects of debt contracts, are found to be poor
surrogates for default or renegotiation costs.


Key Words: Technical default, Debt covenant violation, Renegotiation, Leverage, 
Financial distress. 


Data Availability: A list of sample firms is available from either author.


I. INTRODUCTION 


Accounting researchers have long maintained that technical default on covenants in
debt agreements is costly and negatively impacts shareholder wealth. However, the 
findings on wealth effects are mixed, and prior research provides little evidence of 
default costs. For example, Frost and Bernard (1989, 789) write that "evidence of economic 
consequences operating through debt covenants has usually been weak." Furthermore, though 
previous research alludes to technical default costs, Watts and Zimmerman (1990, 151) note
“researchers have been unable to document the magnitude of the costs imposed by technical
violation of a debt covenant or the magnitude of renegotiation costs.” In this paper, we investigate
whether technical default impacts shareholder wealth, and estimate default and renegotiation
costs by combining the outcomes of debt contract renegotiation with an analysis of stock prices. 

Recent research by Beneish and Press (1993), DeFond and Jiambalvo (1994), and Sweeney 
(1994) documents effects from resolving technical default, but yields only limited evidence on 
costs to shareholders. What these authors observe is that renegotiation following technical default 
entails three principal changes in debt agreement terms: (i) additional covenants, (ii) increased
interest rates, and (iii) reduced allowable borrowings. Evidence of incremental covenants—such 
as new borrowing restrictions and limitations on capital expenditures—is interpreted by DeFond
and Jiambalvo and by Sweeney as default costs. Beneish and Press (1993) treat these outcomes 
as consequences of technical default because it is unclear that new covenants per se impose costs. 

On one hand, for added covenants to be costly, they must be binding. On the other hand, if 
added covenants give lenders increased control, potential value-preserving benefits can arise 
(Wruck 1990). That is, from a shareholder perspective it may be optimal to reduce managerial 
discretion within technical default firms. Thus, it is an open question whether the potential 
benefits from increased lender oversight exceed the costs of a shrunken opportunity set. Similarly, 
while post-default interest rate changes and reductions in maximum allowable borrowing are 
identified by Beneish and Press, DeFond and Jiambalvo, and Sweeney, none provides evidence 
of whether these affect stock prices, and only one type of cost (refinancing) is estimated by 
Beneish and Press. Difficulty in obtaining and analyzing debt agreements has been an obstacle 
in assessing default costs directly. Therefore, we focus on assessments of post-default changes
in debt agreements and stock prices. 

We establish that defaults on debt covenants are associated with significant shareholder 
wealth losses. In the three-day period surrounding announcements, the average abnormal return 
in our sample of 87 first-time disclosures of technical default is -3.52%. We find that some of the
consequences of technical default are impounded in stock price. New financing and investing
constraints imposed by lenders—a subset of which is binding—are costly, suggesting that the
costs of shrinking opportunities exceed the benefits of increased lender monitoring. Higher 
financing costs resulting from renegotiation also have an adverse impact on shareholder wealth. 
We find no relation between stock prices and reductions in allowable borrowing, a result 
attributed to loan reductions in our sample eliminating borrowing slack without requiring large 
repayments.

We also investigate leverage as an explanatory variable of the abnormal returns around 
technical default announcements, since it has been widely used in previous research. Because 
capital structure varies across firms given different investment opportunities (Smith 1993), we 
also test a change in leverage variable. We find that neither leverage levels nor changes in leverage
proxy for default costs. However, leverage changes are associated with the valuation effects of 
technical default, suggesting they may be a signal about cash flow realizations subsequent to 
default. We conclude that tests of debt covenant effects are better specified using debt contract 
data. 

The next section develops hypotheses about the stock market effects of technical default. We 
review the procedures used to obtain our sample in section three. In the fourth and fifth sections, 
we describe how we measure abnormal performance and present evidence of shareholder wealth 
losses around technical default announcements. The losses are related to effects of the resolution 
of technical default. We also assess whether using data drawn from debt contracts enhances the 
specification of tests of the debt covenant hypothesis. In the last section, we provide a summary 
and conclusions.


II. HYPOTHESIS DEVELOPMENT


Analyzing how the stock market responds to an event is a widely-used method of measuring
the impact of economic events that has not been applied to estimating the magnitude of financial 
distress costs. One difficulty in its application is an identification problem: stock price responses 
to announcements of events of financial distress reflect both the costs of distress and a signal about 
firms’ expected future cash flows. Further, the direction of the signal about firms’ future cash 
flows is ambiguous. We use data drawn from debt contracts and control for contemporaneous 
information releases to address these complications. 

The first hypothesis concerns the wealth impact of technical default announcements.1 On one
hand, negative effects of unanticipated technical default announcements stem from at least three 
sources: the shrinkage in the investment opportunity set associated with the imposition of tighter 
or supplementary constraints, or through forced sales of collateral assets; other explicit contract- 
ing costs, such as refinancing at higher rates or accelerating loan maturities, which entails 
repayments; and the announcement itself as a signal about lower future cash flows from 
continuing operations. On the other hand, if technical default is a consequence of bad manage- 
ment rather than exogenous factors, lenders’ increased control can thwart or reverse managers’ 
dissipation of firm assets. The possibility of improved efficiency arising from greater lender 
control might, over time, shift firm cash flows upward. Thus, our first hypothesis test is two-tailed 
because of potentially opposing effects of technical default. The hypothesis, stated in alternative 
form, is:


H1: Announcements of technical default impact shareholders’ wealth.


Similar to previous researchers, we assume that investors form expectations about default 
costs and impound them in stock prices at the time of technical default announcement. But rather 
than use leverage as a proxy for default costs, we make predictions about cross-sectional variation 
in the wealth effects of technical default announcements as a function of consequences arising 
from the resolution of the event of default.2 Using changes in terms of debt agreements, we
identify three consequences of technical default: incremental financing costs, reductions in 
allowable borrowing, and additional covenants. In the second hypothesis, we evaluate whether 
increased borrowing costs following renegotiation are negatively associated with stock price 
reactions. In hypothesis 3, we test whether repayment demands resulting from reductions in 
allowable borrowing (that can limit firms’ investment opportunities) reduce firm value. 


H2: The stock price reaction to announcements of technical default is negatively related to
the incremental financing costs arising from renegotiating terms of debt agreements. 


H3: The stock price reaction to announcements of technical default is negatively related to
reductions in allowable borrowing. 


In hypothesis four, we evaluate the impact of added covenants on firm value. The hypothesis is 
two-tailed because new covenants restrict opportunities, but also increase lender control and 
potentially help preserve value. 


1 We define technical default as the violation of accounting-based covenants in lending agreements. As such, we do not
regard firms that default on debt service as technical defaulters.

2 See, for example, Bowen et al. (1981), Collins et al. (1981), Holthausen (1981), Lilien and Pastena (1982), Daley and
Vigeland (1983), Johnson and Ramanan (1988), Trombley (1989), and Chen and Wei (1993).

H4: The stock price reaction to announcements of technical default is affected by the
imposition of additional covenants.


Our fifth hypothesis evaluates the relation between leverage and stock prices. Given the high 
cost of obtaining and analyzing debt agreements, prior research has used leverage to proxy for 
expected costs of default. The argument for assuming leverage is correlated with default costs is 
well-established: as the relative level of debt rises, a firm is more likely to have tighter constraints 
in its debt agreements to protect creditors, which increases the likelihood of bearing costs of 
covenant non-compliance. Thus, the hypothesis is: 


H5: The stock price response at the time of technical default announcements is negatively
related to leverage. 


III. SAMPLE SELECTION 


Identification 


There is no systematic public record of technical default. However, financial statements 
disclose defaults because SEC Regulation S-X (§210.4—08) requires that “any breach of covenant 
of a[n]...indenture or agreement, which... exist[s] at the date of the most recent balance sheet being
filed and which has not been subsequently cured, shall be stated in the notes to the financial
statements" (SEC 1988).3 Our sample is based on the firms in technical default studied in Beneish
and Press (1993). They used Compact Disclosure, the National Automated Accounting Research 
System, and the Dow Jones News Service to search financial statements for keywords identifying 
debt covenant violations, and identified 202 cases of potential default during fiscal years ending 
between 1983 and 1987. Their sample was pared to eliminate: 


a. firms that were not in compliance with environmental regulations or financial institutions
not in compliance with reserve requirements, 

b. firms that miss an interest or principal payment, or seek protection from creditors under 
the Federal Bankruptcy Act, 

c. firms that appeared in the sample more than once, 

d. firms that violated outside the period 1983 to 1987, 

e. firms for which Forms 10-K were unavailable in their university library collections. 


Beneish and Press (1993) derived a sample of 91 firms. We eliminate four additional firms by 
requiring that security returns be available on the CRSP Daily Returns tape on the day of, and for 
161 days after, the announcement of technical default. 

For the 87 firms included in the sample (42 New York and 45 American Stock Exchange 
firms), the initial year of violation is labelled as Year 0. Over the period 1983—1987, technical 
default occurs among 4.4% of all firms in the 32 different two-digit SIC industries in our sample. 


3 There are also three accounting pronouncements that indirectly require default disclosure. Financial Accounting
Standard No. 78 (1983) and Emerging Issues Task Force Release 86-30 (1986) dictate disclosure of the circumstances
of a default when long-term debt is reclassified as a current liability. SAS No. 59 mandates that lack of compliance with
covenants is a basis for auditors disclosing going concern problems. Because of these rules, financial statements reflect
the occurrence of material, uncured debt covenant violations.

SIC codes 10-19—extraction and construction companies (7.1%)—and SIC codes 30-39—
durable product manufacturers (6.7%)—have higher than average rates of technical default, but
no single industry group predominates in the sample. 

Using Wilcoxon rank-sum tests, Beneish and Press (1993) compared the distribution of size, 
leverage, liquidity and profitability measures of firms in technical default to all non-violator firms
available on the Compustat Primary, Supplementary, and Tertiary annual file and the Full 
Coverage file in the two-digit SIC industries to which violators belong. Statistically significant 
differences obtain; firms which experience technical default are smaller, more levered, less liquid 
and less profitable than non-violators. 


What Leads to Technical Default? 


Beneish and Press (1993) present evidence on the causes of technical default. After 
examining whether voluntary accounting changes or compliance with mandatory policies force 
firms into default, they conclude "that technical violations are induced by financial distress rather 
than accounting changes" (p. 243). Papers that employ debt covenant effects in explaining stock 
price responses to accounting changes have hypothesized and, in a few instances, documented
reductions of covenant slack. Yet with the exception of Frost and Bernard's (1989) study of 18 
firms forced to comply with an SEC rule change on reserve recognition, no paper reports technical 
default caused by accounting changes. While Frost and Bernard determine that two of 18 firms 
are in technical default, one was already in technical default in the year prior to the mandated 
change (Beneish and Press 1993). For the second firm, it is not possible to determine whether the
SEC ruling or existing poor financial condition caused the default. 

It is also unlikely that accrual usage causes technical default. DeFond and Jiambalvo (1994)
present evidence that firms attempt to avoid default by increasing accruals in the year prior to, and 
to a lesser extent, in the year of technical default. When they examine a sub-sample for which 
identifiable debt covenant constraints are available, they find that the accrual amounts for these 
firms were insufficient either to avoid or cause technical default. They suggest this “indicates that 
the violating firms would find it difficult to manipulate to an extent that would avoid violation" 
(p. 173). 

Thus, neither accounting policy changes nor accrual behavior causes technical default. The 
balance sheet and profitability characteristics of defaulters noted above suggest that financial 
distress is a more likely cause. Nonetheless, technical default is of interest to accounting 
researchers because it is a setting where covenant default costs can be identified and their impact 
on shareholder wealth assessed.4


Disclosure of Technical Default 


Disclosures of technical default occur in news media stories, filings made to the Securities 
and Exchange Commission (Forms 10-K or NT10-Ks), annual reports to shareholders and 
Moody's bond manuals. For each firm in the sample, we collected the following data pertaining 
to Year 0 (year of initial violation): 


(a) The fiscal year-end, taken from a microfiche copy of the Form 10-K. 
(b) The date the SEC received the Form 10-K, from the WORKLOAD computer listing of 
public filings available in the SEC Public Reference Room in Washington, D.C. 


4 Technical default is also important because it is an event associated with increased likelihood of more serious distress,
such as debt service default and bankruptcy (Beneish and Press 1995).
5 A Form NT10-K (non-timely 10-K) filing explains why a firm cannot meet its Form 10-K filing date.

(c) The date the SEC received an NT10-K, if a firm filed one.

(d) The date the SEC received the annual report, taken from WORKLOAD.

(e) All stories in The Wall Street Journal from the beginning of the year prior to Year 0 to
nine months after Year 0 for which reading The Wall Street Journal Index suggested the
story may disclose technical default.

(f) All stories in the Dow Jones News Service (DJNS) from June 1979 to July 1989 obtained
from a keyword search on “violation,” “waiver,” “covenant,” “compliance” and “default.”
The DJNS is a computerized data base that offers text coverage of the Dow Jones
Broad Tape, The Wall Street Journal, and Barron’s. 

(g) Dates of technical default appearing in Moody’s manuals’ descriptions of firms’ 
outstanding debt. 

(h) The four reports preceding technical default for each of the 40 sample firms covered in 
Value Line. 


Using these data, we determine the day technical default was first disclosed (Day 0). For those 
cases where a Form 10-K or annual report reveal violations, we collate the news media stories’ 
dates with the Form 10-K, annual report, NT10-K and Moody's dates in Year 0 to obtain Day 0. 
News media supply Day 0 in 33 instances, Forms 10-K in 52 cases, NT10-Ks provide two, and 
Moody's, Value Line and annual reports supply none.6

Technical default is seldom disclosed alone.7 Over two-thirds of the violation announce-
ments are released concurrently with information about earnings (33 firms), waivers from lenders 
(44 firms), and auditor decisions (13 audit qualifications, 15 loan reclassifications). We also find 
five instances of staff reductions and three cases of asset sales. These phenomena are likely to be 
part of the same economic event, but it is unlikely that they are informationally perfect substitutes. 

If disclosures other than technical default are partially complementary signals, excluding them 
from a cross-sectional model explaining security returns could create an omitted variable
problem. Therefore, we control for contemporaneous confounding events in assessing whether
consequences of technical default are impounded in stock prices. 


IV. MEASUREMENT OF ABNORMAL RETURNS 


We estimate daily prediction errors PE_it for each sample firm i on each event day t using the
market model. Market model parameters are estimated over 300 trading days from day +61 to day
+360, using the equally-weighted New York and American Stock Exchange index on event day t.
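The estimation step described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' procedure: it assumes the market model is fitted by ordinary least squares (the passage does not name the estimator), the function and variable names are hypothetical, and the return series are synthetic.

```python
def market_model_prediction_errors(firm, market, estimation_days, event_days):
    """Return {day: prediction error} for one firm.

    firm, market    : dicts mapping event day -> daily return
    estimation_days : days used to fit alpha and beta (here days +61 to +360)
    event_days      : days on which prediction errors are wanted
    """
    xs = [market[d] for d in estimation_days]
    ys = [firm[d] for d in estimation_days]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                 # OLS slope on the market index return
    alpha = mean_y - beta * mean_x   # OLS intercept
    # Prediction error = realized return minus the market model's predicted return.
    return {d: firm[d] - (alpha + beta * market[d]) for d in event_days}


# Synthetic illustration: a firm that tracks the market except for a -3% shock on day 0.
days = range(-1, 361)
market = {d: 0.01 * ((d % 7) - 3) / 3 for d in days}
firm = {d: 0.001 + 1.5 * market[d] for d in days}
firm[0] -= 0.03
errors = market_model_prediction_errors(firm, market, range(61, 361), [-1, 0, 1])
three_day_cumulative = sum(errors.values())
print(errors, three_day_cumulative)
```

Because the parameters are fitted on days +61 to +360 only, the day 0 shock does not contaminate alpha and beta, which is the rationale for a post-event estimation window.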





6 Forms 10-Q could reveal technical default either in Part II, Defaults upon Senior Securities, or in debt footnotes, 
preempting the annual Form 10-K event date. To ascertain whether a 10-Q precedes the 10-K in revealing the violation, 
we purchased all three 10-Qs filed with the SEC during the year of violation from Disclosure, Inc. Since it is costly to 
purchase all these reports, we randomly chose 10 out of the 52 sample firms with 10-K event dates. 

Under the assumption that the Form 10-K event date is not the first revelation in the year of violation and that the 
Form 10-Q is where the incident is disclosed, there is a one in three chance of observing a violation in a 10-Q. We read 
the 30 10-Qs (10 random firms x 3 quarters) and found no report of technical violation in 10-Qs. If indeed there is a .33 
chance of observing violation disclosure in any 10-Q, the probability of observing none in 30 is less than .00001. Even 
assuming that there is only one chance in ten that a 10-Q is where the default is first reported, the likelihood of observing 
none is still less than .05. We also purchased every Form 8-K filed during year 0 for the ten randomly-picked firms. A 
Form 8-K might contain a technical default as an unscheduled material event. None of the 8-Ks reveal violation. We 
do not find these results surprising because covenant compliance is typically evaluated using annual audited financial 
statements. 
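
The two probabilities quoted in this footnote can be checked with a short calculation. The sketch below assumes the 30 filings are independent draws with a common disclosure probability; that independence assumption and the function name are illustrative and do not come from the paper.

```python
def prob_no_disclosure(p: float, n: int = 30) -> float:
    """Probability that none of n filings discloses the default,
    assuming each filing independently discloses with probability p."""
    return (1.0 - p) ** n

p_one_third = prob_no_disclosure(1 / 3)   # the "one in three" case
p_one_tenth = prob_no_disclosure(0.10)    # the "one in ten" case

print(f"p = 1/3 : {p_one_third:.2e}")
print(f"p = 0.10: {p_one_tenth:.4f}")
```

Under these assumptions both values fall below the thresholds stated in the footnote (.00001 and .05, respectively).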
7 We identify contemporaneous information releases by reference to two sources. First, we regard as concurrent a story 
appearing in The Wall Street Journal Index or Dow Jones News Service during the five trading days -2 to +2 relative 
to day 0. Second, in the case of Forms 10-K or NT10-K disclosure, a default announcement is contaminated if earnings, 
audit qualifications, or loan reclassifications have not been previously announced. 




Beneish and Press—The Resolution of Technical Default 343 


A post-event estimation period is appropriate if firms experience abnormal returns or risk shifts 
immediately preceding the default announcement (see, for example, Dopuch et al. 1986 on audit 
qualifications, or Holthausen and Leftwich 1986 on bond rating changes).8 

The prediction errors are averaged across the N sample firms on each day t to form an average 
prediction error, $APE_t$; these are cumulated over intervals of k days from t through t+k to obtain 
cumulative average prediction errors, $CAPE_{t,t+k}$. The t-statistic used to test whether CAPEs differ 
significantly from zero is based on the time-series variance of portfolio average prediction errors, 
$s^2(APE_t)$, for the 100 days from day +61 to day +160, and incorporates any cross-sectional dependence 
in the daily prediction errors. 
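
The steps above (market model, prediction errors, averaging, cumulation, and the t-statistic) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the toy return series are hypothetical, and the t-statistic uses the form t = CAPE / (sqrt(k) * s(APE)) described in the notes to Table 1.

```python
import math
from statistics import mean, stdev

def market_model(firm_ret, mkt_ret):
    """OLS intercept and slope from regressing firm returns on market returns."""
    mx, my = mean(mkt_ret), mean(firm_ret)
    sxx = sum((x - mx) ** 2 for x in mkt_ret)
    sxy = sum((x - mx) * (y - my) for x, y in zip(mkt_ret, firm_ret))
    beta = sxy / sxx
    return my - beta * mx, beta

def prediction_errors(firm_ret, mkt_ret, alpha, beta):
    """PE_it = R_it - (alpha_i + beta_i * R_mt) for each day t."""
    return [r - (alpha + beta * m) for r, m in zip(firm_ret, mkt_ret)]

def average_prediction_errors(pe_by_firm):
    """APE_t: cross-sectional mean of PE_it, given one aligned PE list per firm."""
    return [mean(day) for day in zip(*pe_by_firm)]

def cape_t_stat(ape_window, ape_estimation):
    """CAPE over a k-day window and t = CAPE / (sqrt(k) * s(APE))."""
    k = len(ape_window)
    cape = sum(ape_window)
    return cape, cape / (math.sqrt(k) * stdev(ape_estimation))

# Toy data (hypothetical): a firm that follows the market model exactly.
mkt = [0.010, -0.020, 0.005, 0.015, -0.010]
firm = [0.001 + 1.2 * m for m in mkt]
alpha, beta = market_model(firm, mkt)
pe = prediction_errors(firm, mkt, alpha, beta)  # approximately zero on every day

# Day -1, 0, +1 APEs from Table 1 (in percent), paired with a made-up
# estimation-period series that serves only to supply a standard deviation.
cape, t_stat = cape_t_stat([-1.25, -2.08, -0.18], [0.5, -0.5] * 50)
```

Because the variance is taken from the time series of portfolio averages rather than from individual firms, any cross-sectional correlation in daily prediction errors is already reflected in s(APE).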

The estimate of the portfolio time-series variance for the test statistic could be sensitive to the 
estimation period chosen. The variance estimated over days +61 to +161 is 20.3% higher than the 
corresponding estimate for days -161 to -61. The t-statistics are approximately nine percent 
lower than those that obtain using the pre-announcement estimation period, but this reduction 
does little to alter the reported significance of estimates of abnormal performance. Consistent with 
Holthausen and Leftwich (1986), we choose the post-announcement estimate as it produces more 
conservative t-statistics.9 
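
The link between the 20.3% higher variance and the roughly nine percent lower t-statistics follows from the t-statistic being inversely proportional to the standard deviation. A brief check, with illustrative variable names:

```python
import math

variance_ratio = 1.203                    # post-event variance relative to pre-event (from the text)
t_ratio = 1 / math.sqrt(variance_ratio)   # t-statistics scale with 1 / standard deviation
reduction = 1 - t_ratio

print(f"t-statistics are lower by about {reduction:.1%}")
```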


V. STOCK MARKET EVIDENCE 


Abnormal Returns Around Default Announcements 


One measure of the economic consequences of technical default is the stock market reaction 
to default announcements. Table 1 presents cumulative average prediction errors over various 
intervals beginning 300 trading days before the technical default announcement and ending 60 
trading days after the event. Two features of the table are noteworthy. First, poor stock market 
performance precedes technical default. Technical defaulters' CAPE from -300 to -61 is 
-14.43%, significantly different from zero at the 10 percent level. In comparison, other events of 
financial distress are preceded by greater losses in shareholder wealth. Returns of firms which file 
for bankruptcy under Chapter XI decrease 43 percent in Warner's (1977) sample, and 74 percent 
in Clark and Weinstein's (1983) in the 12 and three-month periods, respectively, prior to filing. 
In Gilson et al. (1990), firms in which cash flow deteriorates sufficiently to preclude debt service 
lose about 45 percent of market value of equity (on a market-adjusted basis) in the year prior to 
payment default. 

Second, the stock price impact of technical default announcements is negative. As table 1, 
panels A and B show, the mean CAPE from days -1 to +1 is -3.52%, a wealth loss significant at 
the 5 percent level. The median CAPE is -1.49%, and the range is -48.83% to 27.84%. The 


8 Announcement period results are insensitive to the choice of equal- or value-weighted indices. However, the 
announcement period results could be sensitive to the choice of estimation period. Thus, we also use a pre-event 
estimation period from days -300 to -60 and obtain similar results. This is because systematic risk and return variance 
are not affected by the technical default announcements. The evidence indicates there are no significant changes in 
systematic risk (β) pre- and post-default. Mean and median βs estimated on days +61 to +300 are 1.22 and 1.17. While 
the post-default estimates are higher than the pre-default βs (in the period -300 to -61, mean and median βs are 1.10 
and 1.00), Wilcoxon rank-sum tests cannot reject the hypothesis that the βs are drawn from the same distribution 
($H_0: \beta_{pre} = \beta_{post}$). Similar results obtain for a test comparing return variances pre- and post-default. 

9 Because the estimation period for some 1986 and 1987 sample firms includes the October 1987 market crash, we modify 
the variance estimation by excluding the three trading days from October 16 to October 20, 1987. The exclusion alters 
neither the estimated average abnormal performance nor its statistical significance. It results, on average, in a three 
percent decrease in variance. We report results including the three trading days. We also assess the significance of the 
observed abnormal performance using a standardized test statistic described in Dopuch et al. (1986) and obtain similar 
results. 

TABLE 1 
Percentage Cumulative Average Prediction Errors (CAPE) and t-statistics for the 
Sample of 87 Announcements of Technical Violation from 1983 to 1987 


Panel A: Percentage Cumulative Average Prediction Errors 

Days relative       # of days in      CAPE (%)      t-statistic(b) 
to event day(a)     cumulation 

-300, -61           240               -14.43        -1.79 
-60, -2              59                -4.93        -1.23 
-60, -31             30                -1.09        -0.38 
-30, -11             20                -2.05        -0.88 
-10                   1                -0.75        -1.44 
-9                    1                 0.23         0.44 
-8                    1                 0.83         1.58 
-7                    1                -0.33        -0.63 
-6                    1                -0.20        -0.39 
-5                    1                -0.26        -0.49 
-4                    1                -0.26        -0.49 
-3                    1                -0.03        -0.06 
-2                    1                -1.09        -2.08 
-1                    1                -1.25        -2.40 
0                     1                -2.08        -3.99 
+1                    1                -0.18        -0.35 
-1, +1                3                -3.52        -3.89 
+2                    1                 0.37         0.71 
+3                    1                 0.96         1.83 
+4                    1                -0.87        -1.67 
+5                    1                 0.08         0.15 
+6                    1                -0.82        -1.57 
+7                    1                -0.41        -0.79 
+8                    1                -0.71        -1.36 
+9                    1                -0.07        -0.12 
+10                   1                -0.19        -0.36 
+11, +30             20                -0.06        -0.03 
+31, +60             30                -2.74        -0.97 
+2, +60              59                -4.21        -1.05 


Panel B: CAPE (-1, +1) Descriptive Statistics 

Mean                    -3.52% 
Std dev                 10.61% 
Minimum                -48.83% 
First Quartile          -6.34% 
Median                  -1.49% 
Third Quartile            .81% 
Maximum                 27.84% 
Percentage negative     64.40%(c) 




(a) The event day (Day 0) is the date of the first announcement of a firm's violation of its debt agreements. For 33 out of 
the 87 firms, the source of announcement is either The Wall Street Journal or the Dow Jones News Service. 
The remaining 54 firms disclose violations in Forms 10-K or NT10-K. 

(b) Market model prediction errors $PE_{it}$ are calculated as $PE_{it} = R_{it} - (\hat{\alpha}_i + \hat{\beta}_i R_{mt})$, where $R_{it}$ and 
$R_{mt}$ are the continuously compounded rates of return on the common stock of firm i and an equally-weighted NYSE 
and ASE index on event day t. Market model parameters are estimated over 300 trading days from day +61 to +361 
relative to the day of announcement of violation; $CAPE_k = \sum_t APE_t$, where $APE_t = (1/N) \sum_i PE_{it}$. T-statistics 
are calculated as follows: 

$t(CAPE_k) = CAPE_k / (k^{1/2} s(APE_t))$, 

where $s(APE_t)$ is the estimated standard deviation of average prediction errors over days +61 
to +160 relative to Day 0 and k is the number of days in the cumulation period. The statistics are distributed 
approximately t with 99 degrees of freedom. The critical values at the .10 and .05 levels of significance are 1.66 and 
1.99, respectively, for a two-tailed test. 

(c) The hypothesis that the proportion of firms with negative CAPEs is equal to .5 is rejected at the five percent level. 


distribution of prediction errors in the announcement period is slightly negatively skewed, with 
the mean located at the 37th percentile. Fifty-six (64 percent) of the 87 CAPEs are negative, and 
we reject the hypothesis that the proportions of positive and negative abnormal returns are equal 
at the five percent level. Furthermore, trimming five percent (ten percent) extrema from the 
sample yields a CAPE in days -1 to +1 of -3.33% (-3.06%), suggesting that the mean CAPE 
obtained in days -1 to +1 is not driven by a few observations. The subsequent behavior of CAPE 
suggests that the impact of the violation announcement is permanent. The CAPE for days +2 to 
+60 is not distinguishable from zero at conventional levels.10 
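
The test of equal proportions can be illustrated with an exact binomial sign test. The paper does not say which test was used, so the procedure below is an assumption chosen for illustration, and the function name is hypothetical.

```python
from math import comb

def sign_test_two_sided(n_negative: int, n: int) -> float:
    """Exact two-sided p-value for H0: P(negative) = 0.5 under a Binomial(n, 0.5) null."""
    k = max(n_negative, n - n_negative)
    upper_tail = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

# 56 of the 87 announcement-period CAPEs are negative (from the text).
p_value = sign_test_two_sided(56, 87)
print(f"two-sided p-value: {p_value:.4f}")
```

Under this test the p-value is below .05, in line with the rejection reported at the five percent level.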

Our evidence indicates that technical default is associated with significant shareholder 
wealth losses.11 However, it is premature to conclude that technical default is costly because 
potentially confounding events occur simultaneously and create an identification problem. To 
address the problem, a cross-sectional model is presented that controls for contemporaneous 
release of information. 


Regression Model 


We next investigate whether consequences of technical default originating from debt 
covenant changes explain part of the stock price response to announcements of technical default. 
Because default announcements are frequently contaminated by announcements of earnings, 
audit qualifications, and long-term debt reclassifications, we control for confounding releases. 
We specify the model as: 


$$CPE_i = \beta_0 + \beta_1 FINCOST_i + \beta_2 LOANRED_i + \beta_3 ADDCON_i + \beta_4 LEVG_i + \beta_5 WAIVER_i + \beta_6 UX_i + \beta_7 AUDQUAL_i + \beta_8 RECLASS_i + e_i \qquad (1)$$
where 


10 We also assess the statistical significance of the observed abnormal performance using a standardized test statistic. We 
obtain similar results: the statistic equals -1.90 for CAPE (-300, -61), -1.47 for CAPE (-60, -2), -5.18 for CAPE 
(-1, +1), and -.83 for CAPE (+2, +60). 

11 This finding differs from those in Frost and Bernard (1989), who are unable to observe negative abnormal returns for 
their full cost firms affected by a May 1986 SEC ruling that reduced loan covenant slack. The difference in our results 
probably arises from our examination of cases of default, whereas Frost and Bernard study covenant slack reductions. 

$CPE_i$ = Firm i's cumulative prediction error for the three days -1 to +1 surrounding the 
technical default announcement. 

$FINCOST_i$ = Present value of incremental interest costs arising from increased borrowing 
rates on violated agreements, warrants costs, and cost of servicing refinanced 
debt, deflated by the market value of equity two days prior to default 
announcement. 

$LOANRED_i$ = Percentage change in amount of available loan credit, computed as the 
percentage change in maximum allowable borrowing pre- and post-violation. 

$ADDCON_i$ = Percentage change in the number of accounting, investing and financing 
constraints following default and renegotiation, measured as: (Constraints 
post-violation - constraints pre-violation)/Constraints pre-violation. 

$LEVG_i$ = Ratio of total debt to total assets at the end of year 0 (or in six cases where year 
0 data are not available at day 0, at end of year -1). 

$WAIVER_i$ = 1 if, at date of default announcement, firm i has received a waiver of violation 
from its lender, otherwise 0. At the date of technical default announcement, 44 
of 87 firms (51 percent) had waivers. 

$UX_i$ = Unexpected quarterly earnings deflated by the price of a firm's common stock 
on day -2. Unexpected quarterly earnings are computed for the 33 firms which 
announce technical default jointly with earnings, using the Value Line 
Investment Survey, or a seasonal random walk if the firm is not in Value Line. 

$AUDQUAL_i$ = A control variable coded 1 for contaminated default announcements joint with 
audit qualification for firm i, otherwise 0. 

$RECLASS_i$ = A control variable coded 1 for contaminated default announcements joint with 
reclassification of firm i long-term debt as current liability, otherwise 0. 

Testing the cross-sectional model requires proxies for consequences of technical default. 
Beneish and Press (1993) report that 61 firms (of 91) renegotiate their debt agreements; they 
identify changes in covenants from debt contracts for 43 firms and use financial statement 
disclosures to identify changes for the remaining 18. We require that data be available at day 0, 
so our measures differ from those in Beneish and Press (1993). Specifically, we find that 21 of 
the 43 firms with renegotiated contracts have not finished renegotiating at day 0.12 We estimate 
equation (1) either by eliminating these 21 firms, or by coding as “0” any variable based on 
contract term changes for the 21 firms.13 We discuss below how we measure the three debt 
contract-based variables. 

The first variable, denoted FINCOST, is based on the increase in debt service cost imposed 
by lenders post-violation, the cost of issuing warrants to lenders as an inducement for rate 
concessions and the incremental cost of servicing refinanced or exchanged debt. Our estimates 
are similar to those in Beneish and Press (1993), and originate from analyses of renegotiated debt 


12 For the 43 firms with available renegotiated agreements, the mean time from technical default announcement to 
renegotiation is 1.6 months. The range is from two months prior to the announcement to eight months after. The fact that 
renegotiation is sometimes completed before the technical default announcement is consistent with Lummer and 
McConnell's (1989) evidence that lenders have private information about borrowers' financial affairs. The renegotiation 
process for technical default is relatively fast compared to the time to renegotiate for firms in default of debt service 
(15 months in Gilson et al. 1990, table 5). 

13 We compare the 21 firms to the 66 firms along dimensions of size, leverage, liquidity, and probability of bankruptcy. 
Using Wilcoxon rank-sum tests, we find that the 21 firms differ in that they have significantly larger bankruptcy 
probabilities. This finding is consistent with renegotiation requiring more time to complete for riskier firms. 

agreements or Forms 10-K data, except for two differences. First, changes in interest rates must 
be observable by day 0; second, we use loan amounts outstanding at year 0 to calculate the impact 
of increased post-default borrowing costs (Beneish and Press 1993 use year +1 amounts). Since 
FINCOST is measured as discounted incremental cash flows deflated by market value of equity, 
accurate measurement should yield a coefficient estimate equal to -1. We estimate that the average 
increase in the cost of credit represents .96% of the market value of equity prior to default.14 

The second variable, LOANRED, is the percentage reduction in the amount of allowable 
borrowing. Subsequent to default, lenders often reduce the amount of allowable borrowing, and 
LOANRED proxies for the shrinkage in firms' investment opportunity sets resulting from 
restrictions on credit. Using data available at day 0, we find an average loan reduction of 8.7%. 

The third variable, ADDCON, is the percentage change in the number of covenants following 
renegotiation, based on data available at day 0. Beneish and Press (1993, table 6) report numerous 
increases in the number of debt covenants post-violation and show a mean increase of 9.4% in the 
number of covenants for their sample. Using constraint data observable at day 0, we find a mean 
increase in the number of covenants of 5.4%. Most of the new proscriptions impose limits on 
managerial discretion to make investing and financing decisions. Although we cannot assess 
directly whether all the new covenants are binding, we believe that to be the case for a subset. 
Twelve firms are prohibited from issuing additional debt without creditor permission, and 11 
firms need creditor approval to refinance debt. Eight firms are required to sell assets to repay their 
loans, and six firms must obtain approval prior to divesting assets. 
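
The two percentage-change variables can be illustrated with a small calculation. The firm below is hypothetical, and expressing LOANRED as a positive number for a reduction is an assumed sign convention, chosen so that larger credit cuts correspond to larger values.

```python
def pct_change(pre: float, post: float) -> float:
    """Relative change from the pre-violation value to the post-violation value."""
    if pre == 0:
        raise ValueError("pre-violation value must be non-zero")
    return (post - pre) / pre

# Hypothetical firm: borrowing limit cut from 100 to 90; covenants rise from 10 to 11.
loanred = -pct_change(pre=100.0, post=90.0)  # reduction stated as a positive number (assumed sign)
addcon = pct_change(pre=10, post=11)         # (post - pre) / pre, as in the ADDCON definition

print(f"LOANRED = {loanred:.1%}, ADDCON = {addcon:.1%}")
```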


Estimation Results 


Table 2 presents ordinary least squares estimates of equation (1). The sample sizes in 
specifications 1 and 2 differ because debt contracts are renegotiated after technical default is 
announced for 21 of the 87 defaulters. In specification 1, the variables FINCOST, LOANRED, 
and ADDCON are set to zero for these 21 firms, since changes in contract terms are not observable 
at day 0. In specification 2, the model is estimated after deleting the 21 firms. Both regressions 
are significant, with F-statistics (p-values) of 3.38 (.00) and 3.26 (.00). Since the results are 
qualitatively similar, we discuss specification 1 below and refer to specification 2 when the 
differences are of interest.16 
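
The difference between the two specifications is a data-handling choice, which the sketch below illustrates. The `Firm` record, its field names, and the sample values are hypothetical; only the zero-coding and deletion rules come from the text.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

@dataclass(frozen=True)
class Firm:
    name: str
    fincost: Optional[float]   # None when contract changes are not observable at day 0
    loanred: Optional[float]
    addcon: Optional[float]

def _zero_if_missing(x: Optional[float]) -> float:
    return 0.0 if x is None else x

def specification_1(firms: List[Firm]) -> List[Firm]:
    """Keep every firm; code unobservable contract-based variables as zero."""
    return [replace(f,
                    fincost=_zero_if_missing(f.fincost),
                    loanred=_zero_if_missing(f.loanred),
                    addcon=_zero_if_missing(f.addcon))
            for f in firms]

def specification_2(firms: List[Firm]) -> List[Firm]:
    """Drop firms whose contract changes are not observable at day 0."""
    return [f for f in firms if None not in (f.fincost, f.loanred, f.addcon)]

sample = [Firm("A", 0.012, 0.10, 0.05), Firm("B", None, None, None)]
spec1 = specification_1(sample)   # two firms, B coded with zeros
spec2 = specification_2(sample)   # one firm, B removed
```

Zero-coding keeps the full sample for the control variables at the cost of treating late renegotiators as if they faced no contract changes, while deletion avoids that assumption but shrinks the sample.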


14 Our estimates do not include a fee lenders can charge to renegotiate, which could be imposed by the lead bank in a 
multiple lender agreement. Such fees range from 1/16% to 1/8% (6.25 to 12.5 basis points) of the credit granted, with 
the fee varying depending on financial condition of the borrower and competition for bank loans. Nine firms in the 
sample negotiated multi-bank agreements, but we were unable to discern whether they paid renegotiation fees. The 
inability to measure this fee is not likely to affect our inference since the fee is so small. 

15 We considered four other variables in lieu of and in addition to the contract-based variables to examine whether the 
model could be improved. First, given evidence in Chen and Wei (1993), we used the probability of bankruptcy in the 
year of default to capture variation in costs. Second, we tested a variable measuring the change in bankruptcy probability 
between years -1 and 0 as a proxy for changes in the risk of default. Third, we used a variable indicating the number 
of constraints pre-default to capture variation in the shrinkage of firms' opportunities via added constraints. The 
rationale was that the number of new constraints might depend on the original number of constraints. Fourth, we 
considered the market value of common stock prior to default. This proxy for size was included given evidence that 
smaller firms earn greater positive abnormal returns. None of these variables enhanced the specification of equation (1), 
and they are not reported. 

16 We perform specification tests to assess the presence of heteroskedasticity in the residuals and multicollinearity in the 
regressors for specification 1. A White (1980) test on the model (excluding the dummy variables) does not reject the 
hypothesis of homoskedasticity. The Chi-square value is 43.1, with 41 degrees of freedom. The probability of a greater 
value is 0.38. Further, only three of the 36 pairwise Pearson correlations between regressors are significant at the five 
percent level. The diagnostic of Belsley et al. (1980) for multicollinearity indicates that the independent variables are 
not collinear. The condition number (the square root of the ratio of the highest to lowest eigenvalue of the instrument 
matrix) of 8.04 is much lower than the benchmark of 30 suggested by Belsley et al. 

The variable FINCOST tests whether the stock price impact varies according to the 
incremental financing costs imposed by lenders following technical default. The coefficient 
estimate on FINCOST is -.650, significantly different from zero at the five percent level, with a 
t-statistic of -1.69 (one-tailed test, as hypothesized). This result is consistent with investors 
expecting higher costs of financing post-default. It corroborates the incremental financing cost 
estimates based on changes in terms of debt contracts from Beneish and Press (1993), since we 
cannot reject the null that the FINCOST coefficient equals -1. 
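
The claim that the coefficient is indistinguishable from -1 can be checked from the reported estimate and t-statistic alone. The sketch below backs out the implied standard error; the comparison against the two-tailed 10 percent critical value of 1.66 is an assumption borrowed from the notes to Table 1, not a value reported for this regression.

```python
coef = -0.650        # FINCOST coefficient estimate (from the text)
t_vs_zero = -1.69    # reported t-statistic against a null of zero

se = coef / t_vs_zero                     # implied standard error, about 0.385
t_vs_minus_one = (coef - (-1.0)) / se     # t-statistic against a null of -1

print(f"implied s.e. = {se:.3f}, t against -1 = {t_vs_minus_one:.2f}")
```

The resulting statistic is well below conventional critical values, so the estimate is consistent with a one-for-one pricing of incremental financing costs.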

The LOANRED variable tests the price impact associated with reductions in allowable 
borrowing. If reductions significantly restrict lines of credit, we expect a negative association 
with prediction errors (as hypothesized). The coefficient estimate on LOANRED of -.006 is not 
distinguishable from zero. Although sample firms have their credit lines reduced an average of 
8.7%, we attribute the lack of significance to the fact that loan reduction eliminates some 
borrowing slack (firms had borrowed an average of 85 percent of the maximum allowable credit) 
without requiring violators to make large repayments. 

ADDCON proxies for the extent to which lenders add restraints on managers, and allows the 
stock price effect to vary according to the relative increase in number of constraints (as 
hypothesized). The coefficient estimate for ADDCON is -.135, significant at the five percent level with 
a t-statistic of -1.79, suggesting that restrictions on firms' opportunity sets are costly. Since the 
average increase in number of covenants (reported above) is 5.4%, the mean shareholder wealth 
loss from added restrictions is -.73% (.054 times -.135). If ADDCON also proxies for 
value-preserving benefits from increased lender monitoring and control, the benefits seem to be 
outweighed by the restricting of firms' opportunities from new constraints. Note that, by 
construction, ADDCON treats added constraints as if they are equally restrictive. 

The LEVG variable is included to test the assumption in prior research that the relative level 
of debt is related to the expected costs of default (hypothesis 5). The coefficient estimate on LEVG 
of .012 is not distinguishable from zero. The lack of significance on the variable is subject to two 
possible interpretations.17 First, LEVG has no explanatory power over that of the contract-based 
variables. Second, it is possible that, following Smith (1993), regressions using leverage levels 
are misspecified because optimal debt levels vary across firms with differing investment 
opportunity sets. Both explanations are investigated below. 

Other information is released concurrently with technical default announcements. The 
WAIVER variable allows the stock price effect to differ according to whether firms report that 
lenders suspend their contractual rights for a period of time. The coefficient estimate on WAIVER 
is positive and significant at the five percent level (.054, t-statistic = 2.23). This finding is 
consistent with waivers reflecting lenders’ private information that a borrower is worthy of credit 
continuation. It is also consistent with Chen and Wei’s (1993) finding that lenders grant waivers 
to technical defaulters with better future prospects. In specification 2, we drop 21 firms without 
data observable at day 0 and the WAIVER variable does not attain significance. A possible 
explanation is that waivers matter more for the 21 firms. They require more time to renegotiate 
since, as previously noted, these firms have higher bankruptcy probabilities. 

We control for the two most common contaminants we observe: earnings releases and auditor 
decisions.18 Earnings are concurrent with technical default announcements for 33 of the 87 firms. 


17 Similar results obtain with three alternative measures of leverage, the ratios of long-term debt to total assets, total debt 
to equity, and total debt to market value of equity. 

18 We also considered controlling for other, less frequent events. We included dummy variables for firms that had 
staff reductions and asset sales; neither had a significant impact nor altered our findings. 

Twenty-six (79 percent) of the 33 forecast errors are negative; seven are positive. The coefficient 
on UX is positive and significant at the five percent level (.189; t-statistic = 2.24). While consistent 
with the empirical regularity that negative earnings surprises are bad news, the coefficient 
estimate is low relative to prior research. This likely reflects the facts that some of the earnings 
surprises are large, and that not all firms have concurrent earnings signals. Thirteen announcements 
of audit qualification, and 15 announcements of reclassification of long-term liabilities to 
short-term debt occur at day 0. The coefficient on RECLASS is negative and significant, 
consistent with debt reclassification signaling adverse cash flow effects from potential acceleration 
of debt; AUDQUAL does not attain significance in specification 1. 


Examination of Leverage Results 


In table 3, we investigate potential explanations for the results on the leverage variable. 
Specifically, table 2 estimation results of equation (1) indicate that the leverage variable had no 
explanatory power over the contract-based variables. Since prior research uses leverage as a 
surrogate for the constraints imposed by debt covenants, we compare a specification that includes 
the FINCOST, ADDCON, and LEVG variables to one that contains LEVG only. Testing this in 
table 3, we find that the coefficient on LEVG is still not significant when the contract-based 
variables are deleted, suggesting that leverage is an unsuitable proxy for default costs. The 
exclusion test between specifications 1 and 2 indicates that the regression is better specified when 
the contract-based variables are included.19 

We also reproduce our tests using a leverage change variable ($LEVG_0 - LEVG_{-1}$) in 
specifications 3 and 4. Regressions using leverage levels may be misspecified because optimal 
debt levels can vary across firms, given different investment opportunities and asset structures. 
That is, using a leverage level variable treats a firm with leverage of 60 percent as twice as risky 
as one with leverage of 30 percent, even though the former may be a food distributor and the latter 
a steel producer. 

The estimated coefficients on the leverage change variable (—.038, t-statistic = —1.69) are 
identical in both specifications 3 and 4. The negative coefficient indicates a more adverse stock 
price impact for firms with greater leverage increases from year -1 to year 0.20 Assuming that firm 
leverage follows a random walk, the result can be interpreted as the stock price reaction to an 
adverse leverage forecast error. Comparing both specifications, we find that in specification 3 
leverage change attains significance in conjunction with the debt contract-based variables. When 
the contract-based variables are removed in specification 4, we obtain the same coefficient 
estimate, suggesting that the change in leverage is not likely to proxy for default or renegotiation 
costs. This finding is corroborated by the low correlation between the contract-based variables 
and change in leverage (Pearson r = 0.19 with FINCOST; and 0.11 with ADDCON). Since 
leverage change is associated with the valuation effects of technical default, it may be that the 


19 This result seems at odds with the evidence in Beneish and Press (1993, 247) that leverage is significantly associated 
with the incremental financing costs imposed by technical default. However, their evidence is based on a sub-sample 
of 48 violators for which they can ascertain whether borrowing rates are renegotiated. It is possible that leverage only 
serves as a good proxy for incremental interest costs when there is indeed renegotiation. 

20 Similar results obtain when we compute leverage as the ratio of total debt to market value of equity. However, the 
coefficients are not significant when we define leverage as either long-term debt/total assets or total debt/equity. The 
first ratio, which measures only long-term debt, potentially understates leverage since many firms in technical default 
are required by auditors to reclassify their long-term debt to current liabilities. Further, because technical defaulters are 
in financial distress, the second measure may suffer from a small or negative denominator problem that biases against 
finding a significant relation (e.g., al percent of defaulters have book equity less than $10 million, and equity is negative 
for 13 percent of them). 

variable proxies for riskier future cash flows from operations or reduced investment opportunities 
(Press and Weintrop 1990, 90-91; Skinner 1993, 416). These conjectures may be topics for future 
research. 

Overall, we find that technical default is costly to shareholders. In particular, default and 
renegotiation costs originate from post-default financing costs and from imposition of additional 
constraints that shrink firms’ opportunity sets. Our model estimates that default and renegotiation 
costs reflected in stock prices represent an average of 1.4% of the market value of equity. Because 
our evidence derives from an examination of changes in debt contract terms, we are able to 
demonstrate that tests for debt covenant effects are better specified when contract data, rather than 
leverage proxies, are used. 
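
One way to arrive at the 1.4% figure is to combine the sample means of the two significant contract-based variables with their estimated coefficients. The paper does not spell out the calculation, so the decomposition below is an assumption that happens to reproduce the reported number.

```python
# Sample means and coefficient estimates quoted in the text.
fincost_mean, fincost_coef = 0.0096, -0.650   # .96% of equity value; FINCOST coefficient
addcon_mean, addcon_coef = 0.054, -0.135      # 5.4% more covenants; ADDCON coefficient

fincost_effect = fincost_mean * fincost_coef  # priced financing-cost effect
addcon_effect = addcon_mean * addcon_coef     # priced effect of added constraints
total_effect = fincost_effect + addcon_effect

print(f"FINCOST: {fincost_effect:.2%}  ADDCON: {addcon_effect:.2%}  total: {total_effect:.1%}")
```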


VI. SUMMARY AND CONCLUSION 


Tests of the debt covenant hypothesis in prior research have focused on accounting 
measurement changes. However, since financial distress rather than measurement changes 
causes technical default, prior research provides limited evidence on technical default costs. We 
study the resolution of technical default, a setting where default and renegotiation outcomes can 
be observed and their stock price effects assessed. We combine post-default changes in terms of 
lending agreements with stock returns to examine whether the consequences arising from 
renegotiation of debt contracts in technical default are priced in the market. Specifically, higher 
costs of borrowing and new limitations on firms’ opportunity sets negatively impact shareholder 
wealth. We also find that leverage does not capture costs of default in cross-sectional regressions 
with abnormal returns as the dependent variable. This finding holds even when we apply leverage 
measures that control for differences in capital structures across firms. We conclude that tests for 
debt covenant effects are better specified using data drawn from lending agreements. 

In interpreting the results in the paper, we note that the sample likely represents the tail of the 
population of firms in technical default. Current GAAP disclosure rules make it impossible to 
assess how many covenant defaults occur relative to how many are reported. Given this caveat, 
our evidence that technical default costs are reflected in stock prices confirms the usefulness of 
analyzing market reactions to test predictions of the debt covenant hypothesis. In addition, our 
evidence suggests that leverage may be signalling information about firms’ future prospects. It 
is an avenue for further research to investigate whether adjusted measures of leverage are 
correlated with cash flow realizations subsequent to default. 


R.H. PARKER and B.S. YAMEY (eds.), Accounting History: Some British Contributions 
(Oxford: Clarendon Press, 1994, pp. ix, 661, $72.00). 


This collection of articles is presented as a testimonial to the British contribution to accounting 
historical writing. The 23 essays contained herein have two elements of commonality—they are all authored 
by British historians and they all have appeared in the literature relatively recently (since 1965). The volume 
is conveniently divided into eight subject areas with the number of contributions to each stated in 
parentheses: The Ancient World (2), Before Double Entry (5), Double Entry (3), Corporate Accounting (3), 
Local Government Accounting (1), Cost and Management Accounting (4), Accounting Theory (2), and 
Accounting in Context (3). Additionally, the editors have provided a short but useful introduction to British 
accounting historiography, drawing upon their vast experience in the field. 

All but one of the pieces (Professor Yamey's “Balancing and Closing the Ledger: Italian Practice, 1300-1600”)
have appeared previously in some form. However, many of the contributions have been revised since
original publication under the careful tutelage of the editors. Other authors have been content to let their 
work stand without amendment, while in some instances postscripts have been allowed for an author's 
further reflection on the topic. 

Because the prominence of both the historians represented and the articles themselves is without 
question, this reviewer believes analysis of the essays on an individual basis unnecessary. An issue that 
needs to be addressed, however, is the editorial decision making process which resulted in the specific 
composition of this compendium. The mesh appears appropriate from a number of perspectives. There is 
adequate representation from the economic history discipline. The works of five historians with particular 
expertise in or awareness of accounting issues are included. The current academic postings of the
participants are at 19 different institutions of higher education, all but two in the U.K. Contributed articles 
have appeared in 11 different academic journals or as chapters in three monographs. Notwithstanding, the 
backgrounds of the editors are not belied by this wide distribution. Five articles have been authored by past 
or present faculty at the London School of Economics where Professor Yamey spent his illustrious career. 
Indeed, the LSE remains one of the few venues in the world where students have the option to take a course 
in accounting history. Five of the essays are reprinted from Accounting and Business Research to which 
journal Professor Parker brought editorial support for historical research. 

My only criticism of the selection process, but a major one in my view, is the absence of "critical" 
scholarship, currently an important focus for many British academics. There are no articles reflecting the 
research efforts of theorists within the context of either the Marxist/labor process or Foucauldian paradigms.
I would have liked to have seen work by Hopper and Armstrong, Hoskin and Macve, Loft, Miller, and/or 
others representative of the “new accounting history.” I fear that one of the editors’ stated goals, to bring 
"traditional" accounting historians and “new” accounting historians together in a greater awareness of each 
other's work (p. 10), will not achieve fruition given this omission. 

The volume is beautifully presented, meticulously edited, and well-indexed, justifying its hefty price 
tag. I came away with the wish that I could have been born British so that my work might then have been 
at least considered for inclusion. 

RICHARD K. FLEISCHMAN 
Professor of Accountancy 
John Carroll University 

PAULINE WEETMAN, BILL COLLINS and ELIZABETH DAVIE, Operating and Financial
Review: Views of Analysts and Institutional Investors (Edinburgh: The Institute of Chartered
Accountants of Scotland, 1994, pp. viii, 108, £12.50). 


This research report is the first stage in the plans of the Research Committee of the Institute of Chartered 
Accountants of Scotland to investigate several aspects of the use of published financial information. The 
monograph, by three British academics, reports the results of an in-depth structured interview study seeking 
the views of analysts and institutional investors on specific aspects of the content and presentation of the 
“Operating and Financial Review” (OFR). This is a non-mandatory approach for management communication
with annual report users, suggested in a Statement of the U.K. Accounting Standards Board in July
of 1993. Basically, the OFR Statement encourages companies to provide additional information on the main
factors underlying their financial performance and position, some of the essential features of which are:
explanations and reasons for change; discussion of historical trends and known events; and comments on 
trends and uncertainties expected to impact upon the firm in the future. The proposed contents of the OFR 
reflect many features of the Management Discussion and Analysis (MD&A) required by the SEC in Form 
10-K and included in the annual report of many U.S. companies. 

The discussion paper (April, 1992) preceding the release of the ASB Statement prompted 104 replies 
on the initial proposals contained therein, and it is these responses and their perceived “preparer bias” which
provide the foundation and give direction to the research. Indeed, a claimed secondary result of analysis of 
these responses is to raise questions about the effectiveness of the process by which the ASB is lobbied and 
the potential for bias where numbers of respondents are unevenly distributed between users and preparers. 
Nevertheless, the primary purpose of the study is to determine the usefulness of the OFR to professional users 
of the annual report, by going directly to a “horse’s mouth” rather than a “horse’s proxy" so to speak. Hence 
users are directly consulted in 20 interviews with 22 senior staff of leading U.K. brokers (analysts) and 
institutional investors (fund managers, investment research staff) on questions structured around five 
“themes” or “user-oriented issues." These issues were identified from the user-oriented (but essentially 
preparer-sourced) responses to the discussion paper, and restated in question form are: Is there really a user 
need for such disclosures? Will voluntary compliance work? Is commercially sensitive/confidential 
information involved? Should forward-looking information be avoided as too problematic? Is the approach 
too prescriptive and does it require too much detail?

The five issues specifically addressed are all familiar ones which repeatedly arise when additional 
disclosure is proposed, not just in the United Kingdom but just about everywhere. Yet the questions asked 
and the answers provided actually range across a much wider and sometimes even more interesting ground, 
extending to how information is disseminated within an industry, the impact of the legal environment on the 
evolution of reporting standards, the nature of the management/analyst relationship, and so on. The authors 
maintain that “it would be fascinating to publish the interview texts verbatim" (p. 25), and this might well 
be true judging from the limited quotations provided in the report. 

The wide range of generic and germane topics touched upon, and the high level and highly informed 
sources of insight drawn upon, should make this monograph of interest to a broad readership. It is also of 
potential use in the classroom, particularly with Ph.D. students, as it is a reasonably well constructed,
readable and revealing example of a research approach now rarely used in financial accounting research in
the U.S. Moreover, it is conducted in a manner and style more typically European, with what may be
described as “passionate objectivity,” and thus can serve in refreshing (or annoying) contrast to the more 
reserved and reticent "safety in numbers" approach typically taken in U.S. academic research. 

DENNIS H. PATZ 
Professor of Accounting 
Oklahoma State University 


TAKEO YOSHIKAWA, FALCONER MITCHELL, and JIM MOYES, A Review of Japanese 
Management Accounting Literature and Bibliography (London: The Chartered Institute of 
Management Accountants, 1994, pp. vii, 89). 


The study of Japanese management, and more recently, Japanese management accounting has received 
considerable attention from Western academics. Are Japanese academics equally fascinated with the 
nuances of their accounting and control systems, or is the majority of this research driven by Western 
interests alone? A Review of Japanese Management Accounting Literature and Bibliography begins to
answer this question.

This book has two stated aims. First, to summarize the Western literature on Japanese management
accounting philosophies and practices, including the work of Western as well as Japanese researchers
reporting in Western journals and books. Second, to provide a more direct examination of Japanese 
management accounting by reviewing the indigenous Japanese literature on cost and management 
accounting. The book is intended to provide an initial source of information for those interested in the topic. 
To accomplish the first aim, the authors provide a brief review of some of the Japanese management 
accounting literature which has been published in the Western press, identifying the main characteristics
which the Western literature attributes to Japanese management accounting, and noting some important
sources which are useful in starting a literature review on the topic. While the summary is interestingly 
written, and provides a good initiation for researchers new to this topic, it is by no means exhaustive. Because 
there is considerable overlap between the management accounting, international business and management 
literature with respect to Japanese management practices, researchers seriously pursuing the topic of 
Japanese management accounting will need to conduct their own literature review, and must search a wider 
range of journals than is included in this book to get a complete picture of what has been published in the 
Western press relating to Japanese management accounting. 

By undertaking the second aim, the authors set out to provide an important and unique resource for
management accounting research which has heretofore been inaccessible to non-Japanese speaking 
researchers. To accomplish this, the authors used the Japanese data base called "NACSIS-IR" which is 
maintained by the National Center for Science Information Systems. This data base contains 900 Japanese 
journals in the fields of economics, business administration, and statistics. The authors examined publications
for the eight and a half year period beginning in 1983 and ending in mid-1991, translating the title, and
where necessary, the abstract or content of 785 management accounting papers. The book categorizes these 
papers into eight broad topics, and provides a brief summary of the issues researched within each category. 
Supplementing this summary is an appendix listing the journal articles by year, with indications of whether 
the author was an academic or practitioner, whether the journal was an academic, professional or commercial
publication, and the category/topic of the paper. Using this appendix, readers can identify Japanese 
researchers and manuscripts which pertain to the management accounting topic they want to address. In 
addition to this appendix, there is another listing which provides an English translation of the titles of the 
93 cost and management accounting books from the current catalogue of Chuokeizaisha Publishing Co., a
leading Japanese publisher. 

The review of the Japanese literature is an important step in providing Western researchers access to 
Japanese language materials on the topic. However, it would have been useful for the authors to provide a 
summary of the Japanese literature in a form more parallel to their Western literature summary. After reading 
about the characteristics that the Western literature has attributed to Japanese management accounting, one 
would like to know whether the Japanese literature confirms these perceptions, the specific Japanese sources
that address the issues identified in the Western research, and the findings of these papers. This summary 
could have been included in addition to the classification of the broader list of articles covering the whole 
range of management accounting. 

Any review of the Japanese literature should also be viewed with an awareness of the characteristics 
of Japanese business and academia. As documented in the book, 99 percent of the Japanese literature is 
generated by academics, and most of this is published in university publications which are exclusive outlets 
for their own staff. This suggests a lack of cross-fertilization between academia and practice, and a lack of 
peer review in the research process in Japan. Joint research between Western and Japanese academics and 
between academics and practitioners is needed to promote a more complete understanding of Japanese 
management accounting and how it compares to practices in the West. 

SHIRLEY J. DANIEL 
Associate Professor of Accountancy 
University of Hawaii 

on manufacturing overhead costs (MOHC) in three plants of a textile firm. An
approach for measuring PMH is adapted from the group technology literature of 
operations. Factor analysis of product engineering specifications identifies seven 
forms of PMH for woven fabrics. Regression analysis indicates that two of the seven 
forms of heterogeneity are costly: differences in processing efficiency and in

I. INTRODUCTION


Operations management researchers define manufacturing flexibility as the ability to
produce a wide range of continually changing products with minimal degradation of
performance.¹ One measure of manufacturing flexibility is the impact of realized
product demands on cost (Son and Park 1987; Gupta and Goyal 1989; Roll et al. 1992).² Recent
surveys document managers’ beliefs that the ability to produce diverse products at low cost is a 
critical manufacturing capability for future success (Slack 1987; De Meyer et al. 1989). However, 
few studies have examined the nature of product diversity and its relation to cost in the absence 
of perfect flexibility. This research develops measures of product mix heterogeneity (PMH) that 
explicitly capture similarities and differences among products and uses these measures to 
estimate more precisely the relation between PMH and manufacturing overhead costs (MOHC) 
in three plants of a textile manufacturer. 

Theories in economics, operations management and management accounting predict that 
producing a heterogeneous product mix increases costs and reduces operating performance 
(Skinner 1974; Panzar and Willig 1977, 1981; Hayes and Wheelwright 1984; Hill 1985; Johnson 
and Kaplan 1987; Cooper and Kaplan 1987; Karmarkar and Kekre 1987; Banker et al. 1988) as 
a result of transactions caused by complex material flows, capacity balancing, quality control, and
change (Miller and Vollmann 1985). However, empirical studies have not found consistent 
evidence of a link between PMH and MOHC (Foster and Gupta 1990; Kekre and Srinivasan 1990; 
Banker et al. 1990, 1992; Banker and Johnston 1993; Datar et al. 1993). The absence of a 
systematic relation between PMH and MOHC may be caused by limitations of the variables 
typically used to capture PMH. Variables commonly used to proxy for the range of products 
produced (e.g., number of products produced), changes to existing products (e.g., number of 
engineering changes), and additions of products (e.g., number of product introductions) fail to 
distinguish similarities and differences among products. Consequently, even studies that find the 
hypothesized relation between these proxies and MOHC offer no insight into sources of
inflexibility. The prominence of product similarities and differences in theories of economies of 
scope, focused factories, and activity-based costing, suggests that this failure may lead to poorly 
specified tests. 

The group technology field of operations management has developed methods for measuring 
similarities and differences among products. This research advances the measurement of PMH 
by using these methods to better estimate the relation between PMH and MOHC. Specifically, 
factor analysis is used to identify seven independent sources of PMH from the underlying 
engineering specifications of woven fabric products. Regression analysis is used to examine the 
relation between these sources of PMH and MOHC in three textile weaving plants during 1986-90.
The results indicate that two of the seven forms of PMH are particularly costly: heterogeneity
that generates differences in the efficiency with which the production process is operated and 
heterogeneity in customer-specified quality requirements. The remaining five forms of PMH, 
which stem from differences in raw materials and combinations thereof, are not associated with 
increased MOHC in these plants. Finally, the analysis provides evidence of “variety-based


¹ See Sethi and Sethi (1990) for a comprehensive review of the manufacturing flexibility literature.

² Stigler (1939) defines a flexible technology as one with a relatively flatter average cost curve over a wide range of output
quantities. Although a less flexible technology might offer lower average costs at some particular output, X", in the 
presence of demand uncertainty that is resolved after the technology choice is made the more flexible technology offers 
lower expected costs for the distribution of expected demand (Marschak and Nelson 1962). 
learning”: experience producing a wide range of continually changing products mitigates the
effect of PMH on MOHC. Adler (1988, 51) claims: 


For managers, flexibility is potentially advantageous—and indeed, only becomes 
meaningful as a concept—against a backdrop of stabilities. The managerial question is 
therefore not simply how to reduce rigidities, but how to find the right mix of stabilities 
and flexibilities. 


By estimating the relation between different forms of PMH and MOHC, this research provides 
a model for how management accountants can help managers identify the right mix of “stabilities 
and flexibilities.”

The paper is organized in five sections. Section II reviews previous efforts to measure the 
impact of PMH on MOHC and suggests an alternative approach for measuring PMH. Section III 
describes the research sites and Section IV describes the measurement of variables. Tests of the 
relation between PMH and MOHC are in Section V and Section VI summarizes. 


II. LITERATURE REVIEW AND THEORY DEVELOPMENT


The Economics of Multi-Product Production 


Economies of scope exist when the cost of producing N products in a single facility is less 
than the sum of costs to produce the same products in N single-product facilities (Panzar and 
Willig 1977, 1981). When large fixed costs accompany production, when individual product 
demand is insufficient to fully utilize these fixed resources, and when application-specific 
resources are not readily "rented" to an alternate user, shared fixed resources generate economies 
of scope in multi-product production (Teece 1980). While shared, application-specific fixed costs 
create economies of scope, cost interactions may increase (diseconomies) or decrease (economies)
total production costs (Gorman 1985). Empirical efforts in economics attempt to assess
economies of scope by econometric estimation of the joint production cost function (Baumol and
Braunstein 1977; Pulley and Braunstein 1992).³ However, in production environments with
hundreds of products produced in different combinations, data limitations often make it 
impossible to fit a joint cost function. 
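
Stated formally, and using notation that is an assumption of this sketch rather than the paper's own, let $C(\cdot)$ denote the cost of producing an output vector in one facility and $q_i$ the output of product $i$. Economies of scope in the sense defined above then require:

```latex
C(q_1, q_2, \ldots, q_N) \;<\; \sum_{i=1}^{N} C(0, \ldots, 0, q_i, 0, \ldots, 0)
```

The right-hand side is the total cost of N single-product facilities, each producing one $q_i$. Diseconomies of scope reverse the inequality.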

Accounting researchers circumvent this problem by adopting the more modest goal of 
estimating plant-level MOHC—an interaction cost hypothesized to be caused by PMH—as a 
function of characteristics of the aggregate product mix. This approach circumvents the problem 
of estimating coefficients for product-specific cost interactions, but creates the need for measures 
of PMH. Two types of proxies have been used with mixed results: measures of product mix 
breadth and change (e.g., number of products or number of production batches) and measures of 
the results of PMH (e.g., engineering changes, batch sizes, and cycle times). Kekre and Srinivasan
(1990) and Foster and Gupta (1990) find no evidence that a broader product line is correlated with 
higher MOHC. Banker et al. (1992) find evidence of a relation between MOHC and variables that 
indicate the presence of PMH but no relation between MOHC and the number of products 
produced. None of these proxies for PMH measure product similarities and differences, the 
hypothesized source of economies of scope and the basis for factory focus (Skinner 1974; Hill 
1985).


³ In estimating the joint cost function, total costs are regressed against output for each product and the cross-products of
output for every combination of products. The functional form depends upon assumptions about the firm’s production 
function. 


366 The Accounting Review, July 1995 


Cooper et al. (1991) address product similarities by substituting complexity-adjusted output 
for unadjusted aggregate output in their estimation of the cost function. They find that complexity- 
adjusted output outperforms unadjusted output. However, their study focuses on a plant with few 
products, all of which are generations of a single product. The complexity-adjustment factors are 
based on a single product attribute (design yield) that increases with each product generation. This
creates a straightforward means for aggregating outputs of different generations. The question 
that remains is how to measure PMH in a multi-product environment where products differ on 
many dimensions, each with different implications for manufacturing cost.⁴


Product Mix Heterogeneity and Manufacturing Overhead Costs 


In the operations literature, group technology was developed as a means for classifying 
products on the basis of production and design similarities. Products are assumed to be uniquely 
described by a well-defined but limited set of N attributes (Hyer and Wemmerlov 1984). Wilson 
and Henry (1977) identify two categories of attributes used to assess similarity for purposes of 
group formation: “graphical” data describe the product in engineering terms, and “manufactur- 
ing” data describe process specifications for producing the product. The relevant product 
attributes for estimating the relation between PMH and MOHC include both graphical parameters 
and process parameters (Miller and Vollmann 1985; Cooper and Kaplan 1987). If a product is 
defined by N attributes, PMH arises when the N-attribute vectors of products produced by a plant 
differ. PMH may arise with either simultaneous (e.g., parallel operations) or sequential production
of different products, denoted simultaneous PMH and sequential PMH, respectively.
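
As a rough illustration only, the attribute-vector view of PMH can be sketched in Python. The distance-based measures, the function names, and the two-attribute data below are assumptions made for this example; they are not the factor-analytic measures developed in this study.

```python
from itertools import combinations
from math import dist


def simultaneous_pmh(mix):
    """Mean Euclidean distance over all pairs of attribute vectors in one period.

    Illustrative assumption: attributes are already on comparable scales.
    """
    pairs = list(combinations(mix, 2))
    if not pairs:
        return 0.0  # zero or one product: no heterogeneity to measure
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def centroid(mix):
    """Average attribute vector of the products in a mix."""
    n = len(mix)
    return [sum(column) / n for column in zip(*mix)]


def sequential_pmh(prev_mix, curr_mix):
    """Distance between the centroids of two consecutive periods' mixes."""
    return dist(centroid(prev_mix), centroid(curr_mix))


# Hypothetical two-attribute products (N = 2) in two consecutive periods.
period_1 = [(0.0, 0.0), (2.0, 0.0)]
period_2 = [(1.0, 0.0), (1.0, 4.0)]

within_period_2 = simultaneous_pmh(period_2)        # products differ within the period
between_periods = sequential_pmh(period_1, period_2)  # the mix shifted between periods
```

Under these assumptions a mix of identical products scores zero on the simultaneous measure, and a mix that does not move between periods scores zero on the sequential measure.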

The maintained hypothesis is that, in facilities that are not perfectly flexible, both forms of 
PMH are costly with respect to one or more of the N product attributes. Skinner based his 
arguments for factory focus on the belief that all forms of PMH engender confusion and goal 
incongruence among production workers and create demands on management to resolve the 
ensuing conflicts. Proponents of cell manufacturing—the practice of applying group technology 
to classify and co-locate products with similar manufacturing requirements—argue that cells 
producing similar products have higher performance as a result of simplified material flows, 
reduced needs for coordination and concentrated responsibility and authority at the point of 
production (Burbidge 1989). In addition to these effects, sequential PMH is associated with fixed 
and sequence-dependent effects of setup. Sequential PMH necessitates machine setups between 
batches of different products. Although a fixed level of downtime is typically associated with all 
setups, an additional variable component of setup time may depend on the “origin” and 
“destination” products (Bitran and Gilbert 1990). The minimum cost path of producing several 
products on a machine is a sequence that maximizes similarity of adjacent products according to 
one or several attributes that are critical to the setup process. 
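
The sequencing argument can be made concrete with a small sketch. The cost function (a fixed setup charge plus a charge proportional to attribute dissimilarity), the product names, and the single-attribute data are hypothetical and chosen only to illustrate sequence-dependent setups; exhaustive search is used for clarity and is practical only for a handful of products.

```python
from itertools import permutations


def changeover_cost(origin, destination, fixed=1.0, per_unit=1.0):
    """Fixed setup downtime plus a component that grows with dissimilarity
    between the origin and destination products (assumed functional form)."""
    dissimilarity = sum(abs(a - b) for a, b in zip(origin, destination))
    return fixed + per_unit * dissimilarity


def best_sequence(products):
    """Search every ordering and return the one with the lowest total setup cost."""
    best_order, best_cost = None, float("inf")
    for order in permutations(products):
        cost = sum(
            changeover_cost(products[a], products[b])
            for a, b in zip(order, order[1:])
        )
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost


# Hypothetical products described by one setup-critical attribute.
products = {"P1": (10,), "P2": (30,), "P3": (12,), "P4": (28,)}
order, cost = best_sequence(products)
```

With these numbers the cheapest path runs through the products in order of the attribute (P1, P3, P4, P2), so adjacent products are as similar as possible, which is the property described in the text.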

This research provides descriptive evidence on the relative costs of simultaneous and 
sequential PMH along each of the N product attribute dimensions and compares measures of 
PMH that capture similarities and differences among products to a traditional proxy for PMH, the 
number of products produced, in estimating MOHC. The general form of the estimated model is: 


MOHC = f (simultaneous PMH, sequential PMH, covariates) (1) 


⁴ Banker et al. (1990) and Datar et al. (1993) take a different approach, first assigning MOHC to products and then
estimating the relation of revised product costs and product characteristics. Using revised product costs as the unit of 
analysis, they treat MOHC as separable in products and estimate costs of product complexity rather than joint costs of 
product mix heterogeneity. 
where “covariates” include prior-to-study and contemporaneous conditions that influence MOHC
in addition to PMH (Kinney 1986). The operations literature suggests that important covariates
include: the capability of process technologies, input quality, and input quantity (Sethi and Sethi
1990). For a given production technology, input quality, and input quantity, MOHC is hypothesized
to be lowest when products are tightly clustered in attribute space with little change from
period to period: products reflect a consistent set of manufacturing priorities, fixed and
sequence-dependent setup costs are minimized, and cross-product learning is maximized.
Manufacturing flexibility, or the best mix of “stabilities and flexibilities,” is achieved by
producing products that are heterogeneous along dimensions that are not costly or by becoming 
better at accommodating costly forms of PMH. A plant is more flexible than a second plant if its 
marginal cost of increased PMH is lower than that of the second plant. 
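
One way to read equation (1) is as a linear regression of MOHC on PMH measures and covariates. The sketch below assumes a linear functional form, a single covariate (input quantity), and invented observations generated from known coefficients; none of these choices comes from the study, and the small solver is included only to keep the example self-contained.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(k):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]


# Invented period-level data: (simultaneous PMH, sequential PMH, input quantity).
observations = [(1, 0, 10), (2, 1, 12), (3, 1, 9), (4, 3, 15), (5, 2, 11), (6, 5, 14)]
# MOHC generated from assumed coefficients so the fit can be checked exactly.
mohc = [100 + 2 * s + 5 * q + 0.5 * v for s, q, v in observations]

rows = [[1.0, s, q, v] for s, q, v in observations]  # leading 1.0 is the intercept
coefficients = ols(rows, mohc)  # [intercept, simultaneous, sequential, quantity]
```

Estimating such a model separately for each plant and comparing the PMH coefficients would give one reading of the flexibility comparison above: the plant with the smaller coefficient bears a lower marginal cost of increased PMH.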


III. RESEARCH SITES


The research sites are weaving plants of a leading U.S. textile manufacturer. During the 
period of this study, 1986-1990, the firm was recognized in the trade and business press as being 
well managed. Top management was motivated to participate in this study because they wanted 
to identify how product proliferation had affected their production costs. They selected the 
Woven Fabrics Division for study because historically it has been the most important business 
for the firm. The author visited six plants and was permitted to select three for study. Plant 
selection was based on differences in PMH and similarities in process technology, input quality 
and input quantity. 

The three weaving plants, referred to as A, B, and C, are cost centers of the Woven Fabrics 
Division. Weaving is the process of interlacing lengthwise warp threads and cross-wise fill 
threads at right angles (figure 1). Warp threads are wound onto a metal core, called a warp beam, 
in an upstream process. During machine setup, the warp beam is mounted in the loom in an 
operation known as a draw. Alternatively, if a second batch of a product is to be produced, the 
threads of the new warp beam are tied to those of the exhausted beam and pulled through the loom. 
This minor form of setup is called a tie. Draws and ties are one manifestation of sequence- 
dependent setup times in weaving. The loom raises and lowers alternate warp yarns and inserts 
the fill thread to form a panel of fabric. The finished fabric is wound onto a cloth beam, and the 
fabric is visually inspected, packed and shipped, typically to a fabric finishing plant.

The firm operates many weaving plants with similar technical capabilities and limits each 
plant's product range through a focused factory strategy that assigns each plant a raw material 
specialty. Plants A, B, and C specialize in inputs 1, 2 and 3, respectively. In addition to its 
specialty, Plant C is a "swing" plant, used to balance capacity utilization of the three plants. 
Management limits PMH through facilities focus, but from 1986 to 1990, both the range of 
products produced each four-week period—simultaneous PMH—and instability of the product 
mix from period to period—sequential PMH—increased. These trends are illustrated in table 1. 

Controlling for minor differences in plant scale (number of looms), panel A of table 1 
indicates that each plant experienced increased simultaneous PMH, with Plants B and C having 
the largest number of unique warps by the end of the period. Panel B characterizes simultaneous 
PMH using the number of unique warp-fill combinations, and again, PMH increased for all of the 
plants, although the increase took different forms. Comparing panel A to panel B, Plant B 
proliferated products by combining different fill threads with existing warp beams. The ratio of 
warp-fill combinations (panel B) to warps (panel A) indicates that, on average, warps are 
combined with two fills in Plant B, as compared to 1.5 for Plants A and C. In the language of
manufacturing, Plant B proliferated products using common components.⁵

FIGURE 1
Weaving Process Diagram

Panel C of table 1 provides a simple measure of sequential PMH, product turnover, computed 
by dividing the number of products produced in five years by the average number produced each 
period. Product mix change surfaces in all three plants; however, Plant C was marked by 
significantly more change than A or B. When unique warp beam and fill thread combinations are
considered, Plant C produced 7.6 entirely different product portfolios from 1986 to 1990. In
contrast, Plants A and B produced 4.8 and 5.4 product portfolios, respectively, in the same period.⁶
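
The turnover arithmetic can be reproduced directly from the warp-fill figures in panel C of table 1. The short Python sketch below uses only those reported totals and averages; the variable and function names are illustrative.

```python
# Warp-fill figures from table 1, panel C:
# plant -> (total products 1986-90, average products per four-week period)
panel_c_warp_fill = {"A": (162, 34), "B": (237, 44), "C": (404, 53)}


def product_mix_turnover(total_products, avg_per_period):
    """Number of entirely different product portfolios produced over the study."""
    return total_products / avg_per_period


turnover = {
    plant: round(product_mix_turnover(total, avg), 1)
    for plant, (total, avg) in panel_c_warp_fill.items()
}

# Five years of 13 four-week periods give 65 periods in total.
periods = 13 * 5
periods_per_portfolio_c = round(periods / turnover["C"], 1)
```

The computed ratios match the turnover column of the table (4.8, 5.4, and 7.6), and dividing the 65 periods by Plant C's turnover gives the 8.6 periods cited in the accompanying footnote.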

To summarize, the plants experienced different levels of simultaneous and sequential PMH. 
Plant A produced a small number of heterogeneous products and experienced the least change in 
product mix from period to period: relatively low simultaneous and sequential PMH. Plant B


⁵ One facet of PMH not addressed by table 1 is the distribution of products over looms. The share of operating machine
hours dedicated to the highest volume product revealed no difference among the plants. Approximately 20 percent of 
capacity was devoted to producing the highest volume product in 1986, falling to 17 percent by 1990. Another aspect 
of PMH not addressed in table 1 is product generations. In 1986 one third of the products that Plant B produced were 
generations of existing products. This compares to one tenth and one seventh for Plants A and C, respectively. 
Differences are less pronounced but still present by 1990. Thus Plant B's product mix is more homogeneous than Plant 
A or C. 

⁶ An empirical question is whether "change" reflects introductions of new products or production discontinuities that
interrupt a product's life cycle. For example, panel B of table 1 indicates that Plant C produced 404 products in five
years, on average producing 53 different products each period. This might mean that products run for 8.6 periods (65
periods/7.6 turns) before being discontinued. Alternatively products might run one period every 7.6 months, a total of
8.6 periods over five years. Comparing the duration of continuous product runs, short product runs are prevalent for all 
three plants; approximately ten percent of all production runs were completed in one period and 50 percent were 
completed in fewer than four periods. This is consistent with the firm's commitment to just-in-time production 
schedules. 
TABLE 1
Average Product Mix Heterogeneity

Panel A: Average number of unique warp beams in production in the 13 four-week production periods of
1986-1990, scaled for plant size.

              1986    1987    1988    1989    1990
Plant A         23      24      24      29      29
Plant B         14      15      22      25      35
Plant C         21      31      28      27      35

Panel B: Average number of unique warp beam-fill thread combinations in production in the 13 four-week
production periods of 1986-1990, scaled for plant size.

              1986    1987    1988    1989    1990
Plant A         31      29      30      42      37
Plant B         30      30      42      51      67
Plant C         35      54      55      56      59

Panel C: The average number of times the product mix changed during 1986-1990, calculated by dividing
the total number of products produced by the average number of products produced in a four-week period.

              Total Number        Avg. Number         Product Mix
Product       Products 1986-90    Products/Period     Turnover
Warp
  Plant A           TI                  25                3.0
  Plant B           62                  22                2.8
  Plant C          155                  29                5.3
Warp-Fill
  Plant A          162                  34                4.8
  Plant B          237                  44                5.4
  Plant C          404                  53                7.6


went from having the fewest to the most products—high simultaneous PMH; but sequential PMH 
was controlled by proliferating through incremental change to existing products. In contrast, Plant 
C experienced the same dramatic growth in the number of products as Plant B but was unable to 
limit either simultaneous or sequential PMH; products produced were dissimilar within a period 
and changed from period to period. This volatility is consistent with its role as the “swing” plant 
in the firm's factory focus strategy. Figure 2 depicts the research design based on similarities and 
differences in the level of simultaneous and sequential PMH of the plants. 

Plant MOHC may be influenced by differences in process technology, input quality, or input 
quantity. The three plants were built within a decade of each other and, after spending several 
weeks in each plant over a period of two years, this researcher found little to distinguish the 
physical facilities or infrastructure. The plants use the same advanced weaving technology and
the production equipment is the same make and vintage—acquired at the same time (mid-1984) 
from the same vendor. Plant B has the largest number of looms, L, followed by Plant C, with .91L 
and Plant A with .83L —thus, size differences are negligible. 

Located in small towns in the southeastern United States, the plants employ nonunion 
workers and pay identical wages for comparable job classifications. Self-sufficient production 
teams of four to five employees are responsible for approximately 100 looms each, and (with few 
exceptions) the plants operated 24 hours per day, seven days per week throughout the period. 
Absenteeism was less than one percent annually during 1986-90. 

As a result of normal promotions, the plants have been marked by high management turnover:
Plants A and B had four and Plant C three plant managers from 1986 to 1990. In contrast, the 
engineering and administrative staffs were stable. The management group at each site is small and 
the mix of job titles is uniform across plants. Consistent with the industry average, turnover among
hourly employees was 20-25 percent annually for all three plants. 

Input quantity per unit output is a function of initial endowments, subsequent demands, and 
the ability to alter resource endowments to match demand. As demonstrated above, the plants had 
similar initial endowments of productive resources and no major investments during the period 
altered that initial endowment. However, subsequent demands placed on the resources differed. 
In particular, the plants experienced different levels of capacity utilization during 1986-90. Plant 
A experienced four periods during which utilization dropped below 60 percent and utilization was 
erratic over time. Utilization at Plants B and C rarely dropped below 80 percent and exhibited 
smooth transitions over time. Company workforce policies and fixed capital resources limit 
managers' opportunities to adjust resource endowments to match demand in the short run. To 
control for the impact on MOHC of managers' efforts to reduce inputs in response to sustained 
reductions in market demand, a measure of excess capacity is included as an independent variable 
in the empirical tests that follow.7

To summarize, the research sites were selected to mitigate differences in covariates that 
influence MOHC and to maximize variation of PMH both between plants and over time. The 
description of the sites is based on comparing the plants' statistics, touring the facilities, 


7 There is of course a danger that multicollinearity between PMH and capacity utilization may limit the ability to draw
inferences about the separate effects of PMH and excess capacity on MOHC. Managers in all of the plants talked about
the sales force accepting more “cats and dogs” (small lots of customized products), yielding greater PMH during
periods of slack demand.
interviewing local managers and workers, and interviewing top management, including the
Vice-president of Manufacturing and the Chief Financial Officer. There is no evidence that any 
plant received assets or resources to uniquely equip it to be better able to cope with PMH, despite 
the fact that historically they have faced different levels and types of PMH. 


IV. MEASUREMENT OF VARIABLES 


Identifying Product Attributes 


Extensive interviews with engineers and production workers revealed that product specifi- 
cations for woven fabrics fall into four categories: those describing 1) warp and fill threads, 2) 
warp beam construction, 3) how the warp and fill are combined, termed “fabric construction,” and 
4) product-specific process specifications (appendix A). Each category includes several param- 
eters. It was evident in employees’ descriptions of woven fabric that the specifications are not 
independent and are too numerous to incorporate simultaneously in tests of the impact of product 
mix on MOHC. Common factor analysis is used to reduce the product specifications to a 
parsimonious set of attributes that retain the information of the original data (Harmon 1976; 
Rummel 1970).8

The maximum likelihood method is used to identify seven independent product attributes 
from engineering specifications of all products produced from 1986 to 1990 (table 2). The factor 
solution is rotated using the varimax rotation criterion to enhance the interpretation of individual
factors. Two measures of goodness-of-fit indicate that the factor solution is a reasonable,
parsimonious representation of the underlying data. Compared to an ideal value of zero, the
square root of the mean squared difference between predicted and actual correlations of the
variables with one another is .02, and the square root of the mean squared partial correlations for
all variables with one another is .07.
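
The first of these goodness-of-fit measures can be illustrated with a small sketch. Assuming orthogonal factors, the correlation between two variables implied by the factor solution is the inner product of their loading rows, and the measure is the root mean square of the off-diagonal differences from the observed correlations. The loadings and correlations below are made-up numbers chosen for illustration, not values from this study, and the function names are hypothetical.

```python
import math

# Sketch under stated assumptions: orthogonal factors, made-up inputs.


def reproduced_correlation(loadings, i, k):
    """Correlation between variables i and k implied by the factor loadings."""
    return sum(a * b for a, b in zip(loadings[i], loadings[k]))


def rms_off_diagonal_residual(observed, loadings):
    """Root mean squared gap between observed and reproduced correlations."""
    n = len(observed)
    squared = [
        (observed[i][k] - reproduced_correlation(loadings, i, k)) ** 2
        for i in range(n)
        for k in range(n)
        if i != k
    ]
    return math.sqrt(sum(squared) / len(squared))


# Three hypothetical variables loading on two hypothetical factors.
loadings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7]]
observed = [
    [1.00, 0.76, 0.14],
    [0.76, 1.00, 0.24],
    [0.14, 0.24, 1.00],
]

rms = rms_off_diagonal_residual(observed, loadings)
print(round(rms, 2))  # 0.02
```

A value near zero, as here, says the factor solution reproduces the observed correlations closely.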

Factors are interpreted by examining the variables that weigh heavily in the factor solution 
and the patterns that emerge when factor scores of products produced during 1986-90 are plotted 
for each plant (e.g., figure 3). The dominant factor, Factor 1, distinguishes differences in products'
raw material content, specifically, whether the warp and fill threads are made of input 1 or 2, the 
specialities of Plants A and B, respectively. Factor 2 differentiates products on fabric weight. 
Fabric weight is correlated with the weight of warp and fill threads, the density of fabric 
construction, and the warp contraction that results from intertwining warp and fill threads. Factor 
3 distinguishes products on the basis of expected machine downtime. A critical source of 
downtime and off-quality fabric is thread breakage, or the “machine stop level,” expressed as
breaks per 100,000 picks. Stop levels are predictable given the thread thickness, although 
breakage rates may be reduced by treating the warp threads (size pick-up) or slowing the machine. 
Factor 4 distinguishes products of different warp beam construction. The diameter of a full warp 
beam is limited by loom capacity. Given the permissible warp beam diameter, warp thread lengths
are determined by the thread thickness and the number of threads on the beam. Warp thread length, 
machine speed and fabric density determine batch size and cycle time for a production run. 
Factors 5 and 7 distinguish differences in fill and warp thread constructions, respectively. Both 


8 A test of whether a common factor model is reasonable is that partial correlations between pairs of variables controlling
for all other variables are smaller than simple correlations between variables. A measure of sampling adequacy of .81
(0 ≤ MSA ≤ 1) indicates that the common factor model is a reasonable representation of product and process attributes and
that the variables adequately define product attribute space (SAS User's Guide: Statistics 1985, 340).
TABLE 2
Rotated Factor Pattern


The results of using the maximum likelihood method of factor extraction and the varimax rotation
criteria to identify independent sources of product variation from product engineering specifications.


Variable                Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Factor 6  Factor 7      C(a)
Slash Stretch                .95      -.09       .24       .10      -.05      -.01       .03       .87
Input 2 Warp                 .95      -.09      -.26       .08      -.03      -.02       .04       .67
Input 2 Fill                 .68      -.18      -.18      -.14      .532      -.09      -.07       .98
Reed Width                  -.55       .05       .19       .07      -.18       .26      -.06       .88
Input 1 Fill                -.67      -.22      -.22      -.15      -.04       .19      -.11       .55
Input 1 Warp                -.84         1       .22      -.32      -.08       .06      -.01       .96
Fill Weight                 -.08       .85       .23       .24      -.13      -.15       .07       .63
Warp Contraction            -.23       .78      -.15       .08      -235      -.11       .07       .69
Fabric Weight                  1       .66       .16       .56      -.08      -.10       .05       .84
Fill Finish                 -.06       .52       .39       .08      -.40      -.16       .08       .63
Size Pickup                 -.06      -.03       .80      -.13       .06       .23       -1f       .69
Machine Stop Level          -.14       .25       .72       .18      -.40      -.01      -.13       .81
Rated Efficiency             .14      -.14      -.76      -.30       .26       .12       .20       .88
Warp Weight                  .23       .26      -.05       .91         0      -.07       .07       .46
Rated Machine Speed          .18      -.20      -.12      -.34       .22      -.13      -.20       .77
Warp Length                   -1      -.08      -.12      -.70      -.05       .02       «17       .98
Fill Denier                  .30      -.12      -.27       .14       .70      -.10      -.06       .73
Input 3 Fill                 .18       .42      -.01       .04      -.66      -.20       .08       .31
Defect Tolerance            -.22      -.26       .16      -.01       .08       .79       .06       .83
Picks Per Inch              -.42      -.43      -.29      -.29      -.07       .52      -.12       .81
No. Warp Filaments           .03       .13        -4       .19      -.08       .06       .75       .78
Warp Denier                  .44       .04      -.17         1      -.08      -.19       .58       .80

% Common Product
Variation Explained          27%       17%       16%       15%       11%        7%        7%

Name:                        Raw    Fabric  Expected      Warp      Fill    Defect      Warp
                        Material    Weight  Downtime      Beam    Thread Tolerance    Thread
                         Content

Squared Multiple
Correlation of
Variables with
Factors                      .98       .88       .90       .94       .85       .80       .80


(a) C = Variable Communality
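
For readers unfamiliar with the summary rows of a rotated factor pattern, the sketch below shows how two of them are conventionally derived from loadings under orthogonal factors: a variable's communality is the sum of its squared loadings, and a factor's share of common variation is its column sum of squared loadings divided by the total. The loadings are made-up illustrative numbers, not values from table 2, and the variable names are hypothetical.

```python
# Sketch only: hypothetical loadings for three variables on two factors.
loadings = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.7],
]

# Communality: share of each variable's variance explained by all factors.
communality = [sum(value * value for value in row) for row in loadings]

# Variance explained by each factor: column sums of squared loadings.
n_factors = len(loadings[0])
factor_variance = [
    sum(row[j] ** 2 for row in loadings) for j in range(n_factors)
]

# Each factor's share of the common (explained) variation.
total_common = sum(factor_variance)
share = [variance / total_common for variance in factor_variance]

print([round(c, 2) for c in communality])  # [0.82, 0.68, 0.5]
print([round(100 * s) for s in share])     # [73, 27]
```

The communalities and the factor variances sum to the same total, which is why the percentage row adds to 100.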
factors are influenced by thickness of fill and warp threads. In addition, Factor 5 depends on
whether the fill is treated to have a luster and Factor 7 depends on how many warp threads are 
twisted together to form a single, stronger thread. Factor 6 distinguishes products based on defect 
tolerance specifications. Standards for acceptable quality depend on a customer's eventual use 
of the product. For example, fashion fabrics typically are produced to tighter tolerances than 
industrial fabrics. An artifact of the firm's product mix is that the highest quality standards are 
required by customers for high density fabrics.9

Figure 3 illustrates differences in the plants' product mixes during 1986-90 for two of the 
seven product attributes, raw material content (factor 1) and defect tolerance (factor 6).10 Each
point on the graph represents the factor score of a single product that was produced during 1986- 
90. Consistent with the firm's factory focus strategy, the graph of raw material factor scores 
indicates that products produced in Plants A and B are tightly clustered in different segments of 
the raw material scale while Plant C's products span the scale.11 However, as the graph of defect
tolerance factor scores indicates, along some dimensions, the product mixes of Plants A and B are 
as diverse as the product mix of Plant C. The firm's use of raw material as the basis for facilities 
focus indicates that managers believe raw material variety is detrimental to performance. The 
question is whether raw material variety is the most costly form of PMH and whether increases 
in the remaining six forms of PMH are accompanied by increased MOHC. 


Independent Variables: Measures of Product Mix Heterogeneity and Excess Capacity 


There are three aspects of sequential and simultaneous PMH: i) frequency of setups; ii)
heterogeneity of products produced simultaneously; and, iii) heterogeneity of products produced
in sequence. The two measures of setup frequency used are the number of major setups (DRAWS)
and the number of minor setups (TIES). A combined measure of simultaneous and sequence-
dependent PMH is constructed for each of the seven product attributes.12 For each of the seven
orthogonal product attributes (j=1...7), a measure of heterogeneity, $PMH_{jt}$, is constructed as the
standard deviation of factor scores, $a_{ij}$, of products (i=1...n), weighted by the number of machine
hours, $m_{it}$, consumed by product i in period t.


(2) 





9 Factor analysis identifies systematic similarities and differences among products but leaves to the researcher the task
of interpreting and linking the factor solution to meaningful constructs. Confidence in the interpretation of these factors
is provided by: (1) the extent to which the factors mirror the four categories that were consistently used to describe woven
fabrics (appendix A), (2) the extent to which relationships between different engineering specifications that were
described (for example, the technical relationship between machine stop levels and size pick-up) emerge in the factor
structure, and (3) the emergence of raw material as the dominant factor explaining product variation and the only factor
that exhibits significant between-plant variations (figure 3).

10 Factor 6 is representative of the remaining five factors. Factor 1, raw material content, is the only factor that distinguishes
the product mixes of the three plants from one another.

11 From 1988 to 1990, Plant B produced a few high volume products made of Plant A's input specialty as part of an
experimental program. This explains the products that fall in Plant A's segment of the raw material content scale.

12 The firm records period output as it is shipped from the plant. It is not possible to unambiguously separate the sequence-
dependent component of sequential PMH from simultaneous PMH because machine-level product records do not exist.
Thus the set of factor scores of products shipped during a period reflect both simultaneous and sequential PMH. 
However, the time to complete a beam of fabric (two to four weeks) is long relative to a period; consequently, 
simultaneous PMH is the primary reason for shipping different products in a period. 
FIGURE 3
Factor Scores of Products Produced, 1986-90 


These figures plot the factor scores of products produced by the three plants from 1986 to 1990 for two of 
the seven factors found to describe product mix heterogeneity. Consistent with the firm’s focused factory 
strategy, the plants’ product mixes differ in raw material content. There is little difference between the plants 
in the defect tolerance of their product mixes. Like defect tolerance variety, graphs of the remaining five 
factors did not indicate significant differences between the plants. 
The use of standard deviation to convert the matrix of factor scores of all products (7 x i)
produced in a period to seven summary measures of PMH for the period is admittedly ad hoc.13
The factor scores are weighted by machine hours to control for the distribution of daily production
over hundreds of looms.14 This approach associates each machine hour (and indirectly each labor


13 One alternative that was explored was the range (maximum - minimum score) for each attribute each period. However, 
this measure was virtually constant for all periods for each plant because extreme products on each attribute scale were 
produced consistently. Consequently range offered little promise for explaining intertemporal changes in variable 
MOHC. 

14 Machine hours differ from linear output because fabrics are produced at different rates. The actual number of machine
hours by product was unavailable. Consequently, machine hours were calculated from actual yardage produced
(including off-quality production), fabric density and rated machine speed. 
hour) with a demand for producing a particular product attribute so that the PMH measure
increases with heterogeneity of machine-level demands. If the factor scores of products produced 
in the month were not weighted, two plants that produced two products during a month would 
have identical PMH whether they devoted 99 percent of capacity to one product and one percent 
to the other or divided capacity equally between products. Cell manufacturing and focused factory 
theories predict that performance declines when employees divide their efforts among diverse 
activities. The weighting scheme captures the extent of this problem explicitly in measures of 
PMH. 
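
The effect of the weighting can be seen in a minimal sketch of the machine-hour-weighted standard deviation described for equation (2). The normalization used here (dividing by total machine hours, with no degrees-of-freedom correction) is an assumption, as are the function name and the two-product factor scores, which are made up for illustration.

```python
import math

# Sketch under stated assumptions: population-style weighted standard
# deviation; the factor scores and machine hours below are hypothetical.


def weighted_pmh(factor_scores, machine_hours):
    """Machine-hour-weighted standard deviation of product factor scores."""
    total_hours = sum(machine_hours)
    mean = sum(m * a for a, m in zip(factor_scores, machine_hours)) / total_hours
    variance = (
        sum(m * (a - mean) ** 2 for a, m in zip(factor_scores, machine_hours))
        / total_hours
    )
    return math.sqrt(variance)


scores = [-1.0, 1.0]  # two products at opposite ends of one attribute scale

lopsided = weighted_pmh(scores, [99.0, 1.0])   # 99 percent of capacity on one product
balanced = weighted_pmh(scores, [50.0, 50.0])  # capacity divided equally
unweighted = weighted_pmh(scores, [1.0, 1.0])  # ignores how capacity is used

print(round(lopsided, 3), balanced, unweighted)  # 0.199 1.0 1.0
```

Without weights both plants would score 1.0; with weights the plant that devotes nearly all capacity to one product registers far less heterogeneity.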

Figure 4 (a)-(g) plots the time series of PMH for the seven product attributes. Two insights 
emerge. Unlike the number of products produced, product mix heterogeneity did not increase 
steadily over the period; some forms of product mix heterogeneity changed very little while others 
decreased. Another insight concerns the relative product mix complexity of the three plants. 
While the number of products produced and the level of product mix change (table 1 and figure 
2) cause the plants to be ordered from most to least complex as C, B, A during 1986-90, figure 
4 indicates that the PMH ordering depends upon the product attribute considered and was rarely 
constant over the period for any single attribute. 

A measure of excess capacity, used to proxy for managers’ efforts to reduce resource 
endowments during sustained periods of reduced demand, is drawn from the plants’ historical 
records. The firm calculates excess capacity as: 


$$\mathrm{EXCESS} = 1 - \frac{\text{std machine hours}}{\text{machine hours available}} \qquad (3)$$


where standard machine hours include setup time and machine hours available equals the
number of looms in place multiplied by the hours in the period (typically 24 hours x 28 days).
Focusing as it does on machines, this proxy may be inadequate for assessing managers' efforts
to reduce costs of other types of resources; however, existing records offered no alternatives.


Dependent Variable: Manufacturing Overhead Costs (MOHC)


The firm defines MOHC as all costs except direct material and material waste.15 Like most
firms, it classifies MOHC as variable or fixed. Excluding corporate allocations, depreciation and
extraordinary items, variable MOHC is approximately 80 percent of total MOHC. Fixed costs
consist primarily of management compensation, management office supplies, dues, taxes and
licenses. Analysis indicates that these do not vary over time with volume or mix of production and
exhibit only a slight inflationary trend. The level of fixed costs is similar across the plants despite
plant differences in PMH. Consequently, for the remainder of the analysis, MOHC refers to costs
classified by the firm's accounting system as variable manufacturing overhead costs.

Historical accounting records maintained by the centralized accounting group are the source
of the MOHC data. MOHC is segmented by type and originating department in the accounting
ledger, allowing exploration of the relation of PMH to components of MOHC. As table 3
illustrates, the plants’ MOHCs are distributed similarly by type and consuming department.


15 The practice of treating even “direct” labor as manufacturing overhead is common among firms with advanced
manufacturing technologies or high-commitment personnel strategies (Berlant et al. 1990). Material waste is primarily
engineered waste rather than flawed fabric. It is resold as rags or bulk waste twice per year. Defective fabrics are sold
at a markdown to external fabric finishers.
TABLE 3


Variable Manufacturing Overhead Costs (MOHC): A Matrix Perspective of Accounting 
Ledgers and Functional Departments for Plants A, B, and C, 1986—90 


Technical % of Total 
Weaving Inspection Support MOHC 
Functional Dept.: Department Department Department Administration by Ledger 
Accounting Ledger:
Operating Labor: x
Straight Time Wage X X A: 37%
Overtime Premia X X B: 44%
Shift Premia X X C: 46%
Power X A: 34% 
B: 28% 
C: 27% 
Operating Supplies X X x A: 14% 
B: 11% 
C: 10% 
Overhead Labor: 
Straight Time Wages X X X A: 10% 
Overtime Premia X X X B: 10% 
Shift Premia X X X C: 10% 
Other/Miscellaneous - A: 5% 
B: 7% 
C: 7% 
Percent of Total A: 45% A: 6% A: 10% A: 38% 
MOHC, by B: 36% B: 10% B: 11% B: 43%
Department C: 42% C: 11% C: 13% C: 34%


V. ECONOMETRIC MODELING AND STATISTICAL RESULTS 


Time Series Properties of MOHC and Product Mix Heterogeneity 


Two difficulties in establishing the relation between variable MOHC and PMH using time 
series data are the likelihood of nonstationarity and persistence of the series. In the case of
nonstationarity, both series are expected to exhibit an upward trend over the five year period as a result
of factors such as product proliferation and inflation. Persistence in MOHC is an artifact of 
acquiring resources in discrete, large quantities, of purchasing contracts that span production 
periods, and of managers' inability or unwillingness to reduce resources in periods of reduced 
demand. Persistence in PMH arises because production of a batch typically spans two periods and 
because demand for products typically reflects trends or fashions that persist for several periods. 
Nonstationarity and persistence increase the probability that regressions of MOHC and PMH will 
document spurious correlations between the variables (Harvey 1981; McCleary and Hay 1981). 
To counter this possibility, the dependent and independent variables are first subjected to
univariate time series modeling to remove variation that is predictable given the historical pattern
of the variable itself (Box and Jenkins 1976; Fuller 1976). Then innovations in MOHC (the
residual variation of the time series models) are regressed on innovations in PMH to
estimate the relation between PMH and MOHC. When PMH, excess capacity, setups, and
MOHC are modeled as output of ARIMA processes, most of the variables are well represented as
first-order autoregressive processes. Small (insignificant) values of the Q-statistic for residuals of the
specified ARIMA model indicate that sources of variation arising from predictable patterns in the
variable itself are removed.16
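
The two-step logic (pre-whiten each series, then relate the innovations) can be sketched as follows. This is an illustration under stated assumptions rather than the estimation actually performed: it fits a first-order autoregression by ordinary least squares instead of a full ARIMA routine, uses the Ljung-Box form of the Q-statistic (the text does not say which variant was used), and runs on made-up series. All function names are hypothetical.

```python
# Sketch only: AR(1) pre-whitening followed by a regression of innovations.
# The two series below are made-up numbers, not data from the plants.


def fit_ar1(series):
    """OLS fit of y_t = c + phi * y_{t-1} + e_t; returns (c, phi, residuals)."""
    lagged, current = series[:-1], series[1:]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    residuals = [y - (c + phi * x) for x, y in zip(lagged, current)]
    return c, phi, residuals


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for residual autocorrelation up to max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx


pmh = [0.90, 0.95, 1.10, 1.00, 1.20, 1.15, 1.30, 1.25, 1.40, 1.35]
mohc = [100.0, 104.0, 112.0, 108.0, 118.0, 117.0, 125.0, 122.0, 131.0, 130.0]

_, phi_pmh, pmh_innovations = fit_ar1(pmh)
_, phi_mohc, mohc_innovations = fit_ar1(mohc)

# Small Q values suggest no predictable pattern is left in the residuals.
q_mohc = ljung_box_q(mohc_innovations, max_lag=3)

# Relation between what is new in MOHC and what is new in PMH.
beta = ols_slope(pmh_innovations, mohc_innovations)
```

In practice the Q value would be compared with a chi-square critical value before the innovations are used in the second-stage regression.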


Product Mix Heterogeneity and Total Variable Manufacturing Overhead Costs 


This section reports the results of regressing innovations in variable MOHC on innovations
in excess capacity, setups and PMH using the method of seemingly unrelated regressions.17
Because PMH measures are based on shipment records and more than half of products shipped
in a period are produced in the preceding period, the model is estimated using measures of PMH
for products shipped in the period immediately following the period in which the cost appears in
the accounting ledger.18 The results of regressing MOHC on excess capacity, major and minor
setups, and the seven forms of PMH are presented in table 4.
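
The timing adjustment described above amounts to pairing each period's cost with the PMH of products shipped one period later. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
# Sketch only: pair cost recorded in period t with PMH of shipments in t + 1.


def align_cost_with_shipments(cost, pmh_shipped):
    """Return (cost_t, pmh_{t+1}) pairs; the last cost period has no match."""
    if len(cost) != len(pmh_shipped):
        raise ValueError("series must cover the same periods")
    return list(zip(cost[:-1], pmh_shipped[1:]))


cost = [100.0, 104.0, 112.0, 108.0]   # hypothetical MOHC by period
pmh_shipped = [0.90, 0.95, 1.10, 1.00]  # hypothetical PMH of shipments by period

pairs = align_cost_with_shipments(cost, pmh_shipped)
print(pairs)  # [(100.0, 0.95), (104.0, 1.1), (112.0, 1.0)]
```

One observation is lost at the end of the sample because the final period's cost has no following shipment period.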

If managers reduce resources during periods of reduced demand, we expect MOHC to
decrease with excess capacity. The results for Plant A are consistent with this prediction;
however, neither Plant B nor Plant C exhibits this pattern of reduced costs. Three explanations are
suggested. First, managers may be reluctant to remove resources from the plant during periods 
of reduced demand; perhaps because they face greater PMH brought about by less discriminating 
sales practices during market downturns or because they predict that the downturn is temporary. 
Second, insignificant variation in capacity utilization may limit the power of the test. Finally, 
machine capacity may be an inadequate proxy for utilization of overhead resources. That Plant 
A, the only plant with several sustained periods of relatively high excess capacity (40 percent),
is the only plant for which excess capacity is related to MOHC supports the first two explanations. 
That Plants B and C had 20 percent excess capacity several times during the period—half that 
experienced by Plant A—suggests the first explanation because realized variation in capacity 
utilization seems great enough to permit estimation of an effect on MOHC if one exists. If the first 
interpretation, of managerial inaction, is correct, it implies that periodically Plants B and C had 
resources that could be redeployed to mitigate the predicted effect of PMH on MOHC.

Turning to the relation between MOHC and one aspect of sequential PMH—setup fre- 
quency—we expect setups to be related to increased MOHC, with major setups corresponding 
to higher costs than minor setups. Plant B exhibits the expected relation of major setup costs 


16 In order to distinguish whether a variable is more accurately modeled as a random walk process or a first-order
autoregressive process with a large value of $\rho$, the following equation is estimated:


$$Y_t - Y_{t-1} = (\rho - 1)\,Y_{t-1} + a_t$$


The t-statistic of the $Y_{t-1}$ coefficient is used to test whether $\rho = 1$. However, because the t-statistic obtained under the null
hypothesis is not asymptotically normally distributed, modified critical t-values ($\tau$-values) tabulated by Schmidt (1990) are
used. This class of tests for a unit root is known as Dickey-Fuller tests (see Kennedy 1992). Results of the time series models
are available from the author by request.
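
The regression in this footnote can be sketched in a few lines: the first difference of the series is regressed on its lagged level without an intercept, and the t-ratio of the slope is the test statistic. The series is made up, the function name is hypothetical, and the sketch deliberately stops short of a decision rule because, as noted, the statistic must be compared with specially tabulated critical values rather than the usual t table.

```python
import math

# Sketch only: Dickey-Fuller style regression without intercept or trend.


def dickey_fuller_t(series):
    """Return (rho_minus_one, t_ratio) from regressing dY_t on Y_{t-1}."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    sxx = sum(v * v for v in lagged)
    coef = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    residuals = [d - coef * v for v, d in zip(lagged, diffs)]
    dof = len(diffs) - 1  # one estimated coefficient
    sigma2 = sum(r * r for r in residuals) / dof
    return coef, coef / math.sqrt(sigma2 / sxx)


series = [2.0, 1.2, 0.5, 0.4, 0.1, 0.2]  # hypothetical, mean-reverting
rho_minus_one, t_ratio = dickey_fuller_t(series)

# rho_minus_one near zero points toward a random walk; a large negative
# t-ratio (judged against Dickey-Fuller critical values) points away from it.
print(round(rho_minus_one, 3))  # -0.444
```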

17 Since the plants produce similar products for similar markets it is likely that the vectors of disturbances of independently
estimated OLS regressions are correlated between the plants (Kmenta 1986, 637). The method of seemingly unrelated 
regressions exploits this linkage to increase the efficiency of the coefficient estimates. 

18 The cross-correlation plots of MOHC and the PMH variables show no indication of lags in the relationship between 
PMH and MOHC. 
TABLE 4


The Relation between Product Mix Heterogeneity and Manufacturing Overhead Costs:
A regression of innovations in total variable manufacturing overhead cost on innovations in
product mix heterogeneity using the method of seemingly unrelated regressions (SUR) for
the period, 1986-90.

N=63, Absolute t-statistics in parentheses

**=significant at 1%, one-tail   *=significant at 5%, one-tail


                                  PLANT A:      PLANT B:      PLANT C:
                                  MOHC          MOHC          MOHC
Constant                          304           -2351         -284
                                  (.15)         (.62)         (.10)
Excess Capacity                   -129089       -73049        2739
                                  (4.09)**      (.77)         (.04)
Major Setup                       170           973           78.2
                                  (.72)         (2.91)**      (.37)
Minor Setup                       167           228           240
                                  (3.04)**      (3.47)**      (5.37)**
Product Mix Heterogeneity
  Raw Material (FACTOR 1)         NA            11083         -19456
                                                (.78)         (.39)
  Fabric Weight (FACTOR 2)        19780         154187        -2015
                                  (.23)         (1.34)        (.04)
  Expected Downtime (FACTOR 3)    173266        -80236        105346
                                  (1.91)*       (.37)         (2.85)**
  Warp Beam (FACTOR 4)            -15054        -149096       7499
                                  (.38)         (1.63)        (1.29)
  Fill Thread (FACTOR 5)          -38526        -306468       19727
                                  (.43)         (2.22)*       (.38)
  Defect Tolerance (FACTOR 6)     184250        155122        -114781
                                  (3.83)**      (1.02)        (1.18)
  Warp Thread (FACTOR 7)          -30356        -223640       -105660
                                  (.61)         (1.22)        (1.58)
OLS Single Eqn Adj R2             .39           .17           .33
OLS Single Eqn F-stat             9:99          2.3           4.0*

Results Without Pre-whitening Variables using ARIMA Models
OLS Single Eqn Adj R2             0.57          0.56          0.80
OLS Single Eqn F-stat             10.1          9.1           26.9
Significant PMH variables
(Factor # above)                  4,6           all           3,3,7
greatly exceeding minor setup costs. In contrast, MOHC in Plants A and C are increased by minor
setups, but indicate no discernable cost of major setups. To explore the source of this puzzling 
result, separate models (not reported) were estimated for each type of overhead cost (table 3) 
included in total variable manufacturing overhead, and the underlying cause of the costs of major 
setups not exceeding the costs of minor setups became apparent. As expected, costs of major 
setups exceed costs of minor setups for virtually every ledger category in all three plants. The 
exception is Power costs. Power costs decrease with major setups because machines are 
physically turned off for the lengthy process of a major setup, but are merely idled for a minor 
setup. This instance of minor setup costs exceeding major setup costs offsets the more common 
effect of sequential PMH and explains the anomalous results for total variable MOHC of Plants 
A and C.19

If the power "savings" of major setups is excluded from total variable manufacturing
overhead costs, the average costs of major and minor setups for the three plants are:


              Major Setup    Minor Setup    Difference

Plant A          $515           $177           $338
Plant B          $345           $130           $215
Plant C          $264           $193           $ 71


An interesting pattern emerges: Plant C, which has the greatest experience doing major setups, 
has the lowest cost of major setups; Plant B, which has the greatest experience doing minor setups, 
has the lowest cost of minor setups. These results are corroborated by the company’s independent 
assessment (used in internal decisions related to scheduling and machine loading) that the 
difference in major and minor setup costs is about $250. Evidence that costs of setups decrease 
with experience suggests variety-based learning—that costs associated with sequential PMH are 
mitigated by experience changing between heterogeneous products. 
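
The comparison drawn from the setup cost figures can be checked with a short sketch. The dollar amounts are those reported above; the variable names are hypothetical, and the simple average across plants is included only as a rough point of comparison with the company's own estimate of about $250.

```python
# Sketch only: per-plant setup costs (major, minor) net of power "savings".
setup_costs = {"A": (515, 177), "B": (345, 130), "C": (264, 193)}

difference = {
    plant: major - minor for plant, (major, minor) in setup_costs.items()
}
lowest_major = min(setup_costs, key=lambda plant: setup_costs[plant][0])
lowest_minor = min(setup_costs, key=lambda plant: setup_costs[plant][1])
average_difference = sum(difference.values()) / len(difference)

print(difference)                  # {'A': 338, 'B': 215, 'C': 71}
print(lowest_major, lowest_minor)  # C B
print(average_difference)          # 208.0
```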

Turning to measures of simultaneous PMH, table 4 indicates that two of the seven forms of 
PMH, expected downtime and customer defect tolerance variety, are significantly related to 
increased MOHC in Plant A.20 Variety in expected downtime is also correlated with MOHC in
Plant C, although a given change in downtime heterogeneity appears to be less costly in Plant C 
than in Plant A. Variety in fill thread construction reduces MOHC of Plant B. The strength of the 
result for Plant B, which had a unique strategy of using new combinations of existing warp beams 
with a variety of fill threads as a vehicle for product proliferation (Section I), suggests that this 
strategy is rewarded with lower MOHC.21 If bobbins of fill thread are exchanged mid-way


19 In Plant B the estimated model for power costs had very low explanatory power with no significant variables. Further
investigation indicated that the difference between the plants in the ability to estimate a model of power costs was caused
by Plant B being located in a different state with different utility supplier billing practices from Plants A and C. Utility
costs of Plant B were billed using a cost averaging scheme that was related to annual demand rather than monthly
demand.

20 One possibility is that defect tolerance variety is proxying for changes in the average defect tolerance of the product
mix, that is, that the plant is being asked to produce products that are individually more difficult rather than collectively more
difficult. Investigating this, I found no instance where, when it was included in the models, average defect tolerance
supplanted defect tolerance variety as an appropriate explanatory variable. I thank James Patell for bringing this to my
attention.


21 Although this effect is not evident in the estimation of total MOHC for Plants A and C, when separate models were
estimated for each type of overhead cost, fill thread variety was significantly related to lower overtime wages for
operating labor and lower regular and overtime wages for indirect (e.g., setup technicians) labor.
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through a beam of warp threads, a third relatively inexpensive form of setup emerges, and smaller 
minimum fabric batch size (heretofore the length of the warp beam) becomes economical. 

Notably absent is a significant impact of raw material heterogeneity—the factor believed by 
the firm to be most costly—on Plant C, which experienced the highest level of raw material 
heterogeneity. One explanation might be that costs to support the capability of producing a broad 
range of raw materials are not included in variable MOHC, the subject of this study; but rather, 
are classified as fixed MOHC. However, as Sections III and IV argued, there is no evidence that 
Plant C's resources differ from those of Plants A or B. Alternatively, costs of raw material
heterogeneity may be so persistent (because managers are reluctant to reduce this "core" 
capability during brief downturns in raw material heterogeneity) that they are removed with 
ARIMA modeling. To investigate this possibility the models were re-estimated with untransformed
variables. The results are summarized at the bottom of table 4. Although substantial contempo- 
raneous correlation between PMH and MOHC is removed by time series modeling, even without 
taking these steps to preclude spurious correlation, raw material heterogeneity is not correlated 
with Plant C's MOHC. 

Because of its role in the firm's focused factory strategy, Plant C was the only plant that was 
expected to experience raw material variety. However, as figures 3 and 4 indicate, Plant B 
experienced an increase in raw material variety in 1988 as part of an experimental program in 
which a high volume product made of Plant A's specialty input was assigned to Plant B. Although 
table 4 suggests that this experiment was not costly, when the relation between PMH and MOHC 
was estimated separately for each type of overhead (not reported), raw material variety was 
significantly related to increases in wages of operating labor and in overtime payments to both 
operating and indirect labor in Plant B. In these disaggregate models, raw material variety 
continues to play no role in explaining costs in Plant C. These results provide further evidence on 
the role that learning plays in mitigating costs of PMH. The form of PMH thought by the firm to 
be most costly, raw material variety, has no impact on the plant that, by design, faced the highest 
level of raw material variety. In contrast, Plant B, which experienced relatively little raw material 
variety, suffered increases in some types of MOHC. Together with the evidence that costs of 
major and minor setups decrease with experience, these results suggest that costs associated with 
sequential and simultaneous PMH may be mitigated through experience producing a heteroge- 
neous mix of products. 

One contribution of this research is identifying a process for developing measures of PMH 
that provide improved estimation of MOHC and greater understanding of the drivers of MOHC 
than traditional proxies. By capturing similarities and differences among products, measures of 
PMH are better linked than simpler proxies to the underlying theories of economies of scope, 
focused factories, and activity-based costing that first motivated estimation of MOHC. Table 5 
presents evidence on improved estimation of MOHC. The model is re-estimated substituting the 
number of unique warp-fill combinations produced (from table 1, panel B) for the attribute-based 
measures of PMH used in table 4. Number of products produced is significantly correlated with 
MOHC in only one case, Plant A. The attribute-based measures of PMH offer greater overall 
explanatory power than the simpler proxy for Plants A and C and perform as well for Plant B. 
When the same model was estimated separately for each type of cost that comprises MOHC (not 
reported), the "number of products produced" was not significantly correlated with any type of 
variable overhead cost. In sum, this research demonstrates that, at least in some environments, 
attribute-based measures of PMH achieve their objective of providing improved estimation and 
greater understanding of MOHC and its drivers. 
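
The comparison of explanatory power can be illustrated with a minimal sketch. It assumes synthetic data and single-equation ordinary least squares, not the SUR estimation on ARIMA innovations used in the paper. The variable names (`attr1`, `attr2`, `count`, `mohc`) and the data-generating coefficients are hypothetical and chosen only to show how adjusted R² is compared across an attribute-based specification and a simple product-count proxy.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjusted_r2(y, columns):
    """Adjusted R-squared of an OLS fit of y on a constant plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xij for bj, xij in zip(beta, row)) for row in X]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    k = p - 1  # number of regressors, excluding the constant
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - k - 1)


# Hypothetical data: two product attributes drive cost; the product count is a noisy proxy.
random.seed(0)
n = 63
attr1 = [random.gauss(0, 1) for _ in range(n)]
attr2 = [random.gauss(0, 1) for _ in range(n)]
count = [a + b + random.gauss(0, 2) for a, b in zip(attr1, attr2)]
mohc = [3 * a - 2 * b + random.gauss(0, 1) for a, b in zip(attr1, attr2)]

adj_attr = adjusted_r2(mohc, [attr1, attr2])   # attribute-based specification
adj_count = adjusted_r2(mohc, [count])         # simple proxy specification
print(round(adj_attr, 2), round(adj_count, 2))
```

In this constructed example the attribute-based specification fits better by design; the sketch says nothing about the size of the difference in the plants studied.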
TABLE 5

The Relation Between Number of Products Produced and Manufacturing
Overhead Costs

A regression of innovations in the number of products produced on innovations in total variable manufacturing
overhead cost using the method of seemingly unrelated regressions (SUR) for the period 1986-90.

                                   Plant A      Plant B      Plant C
                                   MOHC         MOHC         MOHC
Constant                           183.1        -352.5       -454.9
                                   (.08)        (.09)        (.15)
Excess capacity                    -184563      -91595       12577
                                   (5.54)**     (.90)        (.20)
Major Setup                        -20.1        747.8        -54.5
                                   (.08)        (2.20)*      (.27)
Minor Setup                        115.5        190.0        235.0
                                   (1.88)*      (2.92)**     (5.72)**
Number of Products                 957.7        -772.0       -359.5
  (= number of unique warp-fill    (1.68)*      (.97)        (.75)
  combinations produced)
OLS Single Eqn Adj R²              .23          .17          .28

N=63. Absolute t-statistics in parentheses.
** = significant at 1%, one-tailed; * = significant at 5%, one-tailed.


VI. SUMMARY


This paper uses an attribute-based model of product mix heterogeneity (PMH) to examine the 
relationship between variable manufacturing overhead costs (MOHC) and PMH in three textile 
weaving plants of a single firm. PMH takes two forms: sequential production of different products 
on a machine and simultaneous production of different products on many parallel machines. The 
fixed effects of sequential PMH are measured by the number of major and minor setups. A 
combined measure of sequence-dependent and simultaneous PMH is developed from a frame- 
work popularized by the group technology literature of operations management. Factor analysis 
is used to reduce a large set of engineering specifications to a compact set of seven independent 
product attributes, upon which measures of PMH are based. Regression analysis is used to 
estimate the relation between MOHC and PMH variables and to compare the explanatory power 
of attribute-based measures to a traditional proxy for PMH, the number of products produced. 

The results indicate that increased MOHC is associated with increases in the number and 
severity of setups and increased heterogeneity in process specifications (expected downtime) and 
quality standards (defect tolerance heterogeneity) of a plant’s product mix. Evidence that the cost 
of PMH declines with experience producing a heterogeneous product mix is found in two 
relationships. First, estimated differences in the cost of major and minor setups mirror the plants’ 
relative experience performing setups of each type. Second, evidence that raw material variety 
increased direct and indirect labor costs for Plant B, a focused plant, but had no impact on costs
of Plant C, the “swing” plant, suggests that experience producing fabrics from a variety of raw 
materials may decrease the costs of raw material variety. Compared with the number of products 
produced, attribute-based measures better estimate the relation between PMH and MOHC and 
yield increased understanding of factors that drive MOHC. 

Management accountants’ role in supporting manufacturing's effort to become more 
responsive includes assessing the impact of PMH on plant performance, identifying channels 
through which variety undermines performance, and devising yardsticks for evaluating progress 
in achieving manufacturing flexibility. In light of the small sample of this study, an obvious 
opportunity for future studies is replication. Group technology is in widespread use in a number 
of manufacturing settings, so the methods used to develop measures of PMH may be generalizable 
as well. A second opportunity for extending this work lies in considering other sources of 
information to describe products. Although the basis for identifying product similarities and 
differences for this paper was engineering specifications, another possibility for discriminating 
among products is on the basis of resource usage. In particular, the second stage cost drivers used 
in activity-based costing to relate overhead costs to transactions caused by products could be 
useful for identifying underlying PMH. Because machine-level production data was not avail- 
able, this research used simple measures of sequential PMH—the number of major and minor 
setups. Research using sequential PMH measures that are sensitive to similarities and differences 
in products would provide better estimates of the relative costs of sequential and simultaneous 
PMH and would contribute evidence on qualitative differences between sequential and simultaneous
PMH. Finally, this study has provided preliminary evidence that variety-based learning
arises with experience producing diverse products. If true, this finding has the potential to reverse
the factory focus strategy, suggesting instead strategies for accelerating variety-based learning to
achieve complete manufacturing flexibility. Future studies must document the dynamic relationship
between cost and experience producing heterogeneous products and identify the types of
PMH that are subject to learning. Extensions such as these would advance the development of
processes by which management accountants can identify the determinants of an organization's
costs.


APPENDIX A: PRODUCT MIX HETEROGENEITY OF WOVEN FABRICS 


Heterogeneity of woven fabrics stems from unlimited combinations of different threads. In 
order to determine product attributes that contribute to PMH and increased MOHC and to identify 
relationships between product attributes, I conducted interviews with process and industrial 
engineers at each plant and at the Division. Although over 30 engineering parameters were 
mentioned, they can be loosely organized into four categories: filament construction, warp beam 
construction, fabric construction and the product-process interface. 

Filament construction parameters describe the warp and fill threads that comprise a fabric. 
One aspect of filament construction is fiber specifications: fiber content, weight, and filament
treatments. Fiber content refers to the raw fibers used (e.g., rayon, polyester). "Denier" is the
industry's measure of weight per unit length. Chemicals used in the extrusion of manmade 
filaments may further differentiate filaments of the same fiber, changing their luster or shine, or 
imparting color. A second aspect of filament construction reflects upstream textile processes: 
spinning, twisting, dyeing and texturing. 

The above discussion applies to warp and fill threads. Warp threads require additional 
description of the warp beam construction. The denier of the warp thread and the width of the warp
beam indicate the density of the warp threads on the beam and the maximum width of the finished
fabric. A related parameter is the length of each thread on the warp beam. Warp length determines
the duration a loom runs before requiring setup. Warp threads must be stronger than fill threads 
because they are under constant tension on the loom and are abraded with each pick. Consequently 
a chemical, called “size,” is applied to warp threads during warp beam construction. There are 
two size-related specifications: size take-up, the absorption of size as a percent of warp weight, 
and slasher stretch, the extent to which warp contraction caused by sizing is reversed in weaving. 

The third category of product specifications, fabric construction, describes the way that warp 
and fill threads are combined. The weave pattern describes designs on the fabric face. Common 
weave patterns are: satin, twill, and herringbone. Complex patterns are produced at slower 
machine speeds. Another aspect of fabric construction is the dimension of the finished product. 
Though largely governed by warp construction and raw material content, product dimensions are 
also determined by fabric density (picks per inch)—how tightly the threads are packed together, 
and by warp contraction—the percent of warp length lost in weaving as a result of inserting picks. 
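
The percentage specifications defined above reduce to simple ratios. The following is a minimal sketch, assuming that size take-up is measured relative to the unsized warp weight and warp contraction relative to the original warp length; the function names and the numbers in the usage lines are hypothetical and are not taken from the plants' engineering data.

```python
def size_take_up_pct(unsized_weight, sized_weight):
    """Size absorbed in sizing, as a percent of the unsized warp weight (assumed base)."""
    return 100.0 * (sized_weight - unsized_weight) / unsized_weight


def warp_contraction_pct(warp_length, fabric_length):
    """Warp length lost in weaving, as a percent of the original warp length (assumed base)."""
    return 100.0 * (warp_length - fabric_length) / warp_length


# Hypothetical values for illustration only.
take_up = size_take_up_pct(unsized_weight=200.0, sized_weight=210.0)
contraction = warp_contraction_pct(warp_length=1000.0, fabric_length=950.0)
print(take_up, contraction)
```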

The most important feature of fabric, its uniformity, is largely a function of machine settings, 
or process specifications. The primary cause of fabric defects is loom stoppage. Process 
interruptions leave a perceptible line across the fabric. Loom stops occur with thread breakage, 
preventive maintenance or machine failures and are influenced by machine speeds, raw material 
uniformity and fabric construction complexity. Machine speeds are set to the fastest rate 
consistent with quality requirements of the customer. The machine speed chosen implies an 
expected operating efficiency, stop level, and quantity of off-quality production. The following 
table summarizes the engineering specifications: 


Variable                     Variable Measure

I. Raw Material
   1. Fiber content          Binary variable, 0-1 for each of 3 input types
   2. Denier or count        Weight per unit length
   3. Finish                 Categorical variable, 1-5: 1=bright, 4=dull, 5=unfinished
*  4. Dye                    Binary variable, 1=dyed
*  5. Texture                Binary variable, 1=textured
*  6. Twist Multiple         Twists per unit length
   7. Number of Filaments    Number of threads twisted to form one

II. Warp Beam Construction
   1. Warp Length            Length of a warp thread
   2. Slasher Stretch        Percent warp length increase in weaving
   3. Size Take-up           Percent warp weight increase in sizing
   4. Reed Width             Fabric width

III. Fabric Construction
*  1. Type Weave             Fabric pattern
   2. Picks per Inch         Fabric density
   3. Fabric Weight          Fabric weight/linear yd.
   4. Warp Contraction       Percent warp length reduction in weaving
   5. Filament Weight        Warp and fill weight/linear fabric yd.

IV. Product-Process Interface
   1. Picks per Minute       Machine speed
   2. Machine Stop Level     Thread breakage rate
   3. Expected Efficiency    Run time as a percent of machine throughput time (excludes setup time)
   4. Defect Tolerance       Categorical variable, 1-5: 1=wide tolerance range, 5=narrow tolerance range


* Although these variables were mentioned in several interviews, upon further investigation they related to fewer than ten 
percent of the approximately 600 products produced during 1986-90. Rummel (1970) advises against including 
categorical variables that are relevant for fewer than ten percent of the observations in a factor solution. Thus these forms 
of PMH are excluded from the factor solution. 





386 The Accounting Review, July 1995 


REFERENCES 

Adler, P. S. 1988. Managing flexible automation. California Management Review 30 (3): 34—56. 

Banker, R. D., S. Datar, and S. Kekre. 1988. Relevant costs, congestion and stochasticity in production
environments. Journal of Accounting and Economics 10: 171-198.

Banker, R. D., S. Datar, S. Kekre, and T. Mukhopadhyay. 1990. Costs of product and process complexity. In
Measures for Manufacturing Excellence, edited by R. S. Kaplan. Boston, MA: Harvard Business
School Press.

Banker, R. D., G. Potter, and R. G. Schroeder. 1992. An empirical analysis of manufacturing overhead cost drivers.
Working paper, University of Minnesota, Minneapolis, MN.

Banker, R. D., and H. Johnston. 1993. An empirical study of cost drivers in the U.S. airline industry. The
Accounting Review 68 (3): 576-601.

Baumol, W. J., and Y. M. Braunstein. 1977. Empirical study of scale economies and production 
complementarity: The case of journal publication. Journal of Political Economy 85 (5): 1037—1049. 

Berlant, D., R. Browning, and G. Foster. 1990. How Hewlett-Packard gets numbers it can trust. Harvard 
Business Review (Jan-Feb): 178—180. 

Bitran, G. R., and S. M. Gilbert. 1990. Sequencing production on parallel machines with two magnitudes of 
sequence-dependent setup cost. Journal of Manufacturing Operations Management 3: 24—52. 

Box, G. E. P., and G. M. Jenkins. 1976. Time Series Analysis: Forecasting and Control. San Francisco, CA:
Holden-Day.

Burbidge, J. L. 1989. Production Flow Analysis for Planning Group Technology. Oxford, England: Oxford 
University Press. 

Cooper, R., and R. S. Kaplan. 1987. How cost accounting systematically distorts product costs. In
Accounting and Management: Field Study Perspectives, edited by W. H. Bruns and R. S. Kaplan.
Boston, MA: Harvard Business School Press.

Cooper, W. W., K. K. Sinha, and R. S. Sullivan. 1991. Accounting for complexity in costing high-technology 
manufacturing. Working paper, University of Texas, Austin, TX. 

Datar, S., S. Kekre, T. Mukhopadhyay, and K. Srinivasan. 1993. Simultaneous estimation of cost drivers. 
The Accounting Review 68 (3): 602-614. 

DeMeyer, A., J. Nakane, J. Miller, and K. Ferdows. 1989. Flexibility: The next competitive battle, The 
Manufacturing Futures Survey. Strategic Management Journal 10: 135—144. 

Foster, G., and M. Gupta. 1990. Manufacturing overhead cost driver analysis. Journal of Accounting and 
Economics 12: 309-337. 

Fuller, W. A. 1976. Introduction to Statistical Time Series. New York, NY: John Wiley and Sons. 

Gorman, I. 1985. Conditions for economies of scope in the presence of fixed costs. The Rand Journal of 
Economics 16: 431—436. 

Gupta, Y. P., and S. Goyal. 1989. Flexibility of manufacturing systems: Concepts and measurements. 
European Journal of Operational Research 43: 119—135. 

Harmon, H. 1976. Modern Factor Analysis. Chicago, IL: University of Chicago Press. 

Harvey, A. C. 1981. The Econometric Analysis of Time Series. Oxford, England: Philip Allan Publishers.

Hayes, R., and S. Wheelwright. 1984. Restoring our Competitive Edge. New York, NY: John Wiley & Sons. 

Hill, T. 1985. Manufacturing Strategy. London, England: MacMillan Education, Ltd. 

Hyer, N., and V. Wemmerlov. 1984. Group technology and productivity. Harvard Business Review (July/ 
Aug): 140-149. 

Johnson, H. T., and R. S. Kaplan. 1987. Relevance Lost. Boston, MA: Harvard Business School Press. 

Karmarkar, U., and S. Kekre. 1987. Manufacturing configuration, capacity and mix decisions considering 
operational cost. Journal of Manufacturing Systems 6 (4): 315—324. 

Kekre, S., and K. Srinivasan. 1990. Broader product line: A necessity to achieve success? Management 
Science 36 (10):1216—1231. 

Kennedy, P. 1992. A Guide to Econometrics. Cambridge, MA: MIT Press. 

Kinney, W. 1986. Empirical Accounting Research Design for Ph.D. Students. The Accounting Review 61(2): 
338-350. 











Anderson——Measuring the Impact of Product Mix Heterogeneity — 387 


Kmenta, J. 1986. Elements of Econometrics. New York, NY: MacMillan Publishing Co. 

Marschak, T., and R. Nelson. 1962. Flexibility, uncertainty, and economic theory. Metroeconomica 
14: 42-58. 

McCleary, R., and R. A. Hay. 1981. Applied Time Series Analysis for the Social Sciences. Beverly Hills, CA:
Sage Publications.

Miller, J. G., and T. E. Vollmann. 1985. The hidden factory. Harvard Business Review (Sept-Oct) 63: 
142-150. 

Panzar, J. C., and R. D. Willig. 1977. Economies of scale in multi-output production. Quarterly Journal of
Economics 91: 481-493.

Panzar, J. C., and R. D. Willig. 1981. Economies of scope. American Economic Review 71 (2): 268-272.

Pulley, L. B., and Y. M. Braunstein. 1992. A composite cost function for multiproduct firms with an 
application to economies of scope in banking. Review of Economics and Statistics 74 (2): 221—230. 

Roll, Y., R. Karni, and Y. Arzi. 1992. Measurement of processing flexibility in flexible manufacturing cells.
Journal of Manufacturing Systems 11 (4): 258-268.

Rummel, R. J. 1970. Applied Factor Analysis. Evanston, IL: Northwestern University Press. 

SAS Institute Inc. 1985. SAS User's Guide: Statistics, 5th ed. Cary, NC: SAS Institute Inc. 

Schmidt, P. 1990. Dickey-Fuller tests with drift. Advances in Econometrics 8: 161—200. 

Sethi, A. K., and S. P. Sethi. 1990. Flexibility in manufacturing: A survey. International Journal of
Flexible Manufacturing Systems 2 (4): 289-328.

Skinner, W. 1974. The focused factory. Harvard Business Review 52 (3): 113-121. 

Slack, N. 1987. Flexibility of manufacturing systems. International Journal of Operations and Production 
Management 7 (4): 35-45. 

Son, Y. K., and C. S. Park. 1987. Economic measure of productivity, quality and flexibility in advanced
manufacturing systems. Journal of Manufacturing Systems 6 (3): 193-206.

Stigler, G. 1939. Production and distribution in the short run. Journal of Political Economy 47 (3): 305—327. 

Teece, D. 1980. Economies of scope and the scope of the enterprise. Journal of Economic Behavior and 
Organization 1: 223—247. 

Wilson, R. C., and R. A. Henry. 1977. Introduction to Group Technology in Manufacturing and Engineering.
Ann Arbor, MI: University of Michigan.








THE ACCOUNTING REVIEW 
Vol 70, No. 3 
1995 


pp. 389-415


Discretion vs. Uniformity: 
Choices Among GAAP 


Ronald A. Dye 
Northwestern University 
Robert E. Verrecchia 

University of Pennsylvania 


ABSTRACT: We reexamine the "uniformity vs. flexibility" debate by considering the 
consequences of varying the amount of discretion managers have in reporting 
current period expenses. We study the effects of altering GAAP on both the "internal 
agency problem" between current shareholders and their manager and the "external
agency problem" between current and prospective shareholders. We show that the
internal agency problem is ameliorated by expanding discretion, and we exhibit
examples where expanding discretion is undesirable when the internal and external
agency problems are present concurrently and the nonexpense-related components
of earnings are measured with error. We establish that expanding discretion
is welfare-enhancing if either the manager's contract is publicly observable or else 
the nonexpense-related components of earnings are measured without error. These 
results demonstrate the limitations of evaluating GAAP on a piecemeal, or issue-by- 
issue, basis. 


Key Words: Flexibility, GAAP, Discretion, Agency 


I. INTRODUCTION 


The debate on "uniformity vs. flexibility" has a rich history, but no resolution.¹ We focus
on one version of this debate and ask: what are the incentive effects, the allocational
effects, and the distributional effects of increasing the amount of discretion in GAAP? We
study a problem in which a firm's current period activities create expenses that are not realized
until future periods (e.g., future service costs, bad debt expenses, returns, etc., associated with
current sales), and there is a question as to how much of these future expenses should be
recognized currently. We consider two types of GAAP, “rigid uniformity” and “complete
discretion.” When GAAP is rigid, all firms must use the same accounting procedure, e.g., all R&D
must be expensed. In contrast, when GAAP allows complete discretion, every economically
feasible expensing procedure is allowed. When GAAP displays this rich array of accounting
procedures, earnings of different firms in principle may be more "comparable" than when GAAP
is rigid. If every firm selects an expensing procedure appropriate to its circumstances, a dollar of
booked earnings equals a dollar of economic earnings for all firms, resulting in “comparable”
accounting earnings across firms.

¹ See, e.g., Dopuch and Pincus (1984), Sunder (1983), the references in Zeff and Keller (1985), and Wolk et al. (1992).

The presence of agency problems may thwart these potential benefits of discretion in GAAP 
if self-interested managers select accounting procedures that make themselves, or their firms, 
look successful, regardless of their actual economic performance. Thus, whether expanding 
discretion in accounting choice is desirable appears to depend on whether the prospects for 
improved communication of the firm’s financial condition are more than offset by the effects of 
managerial opportunism.²

These general observations about the tradeoffs between complete discretion and rigid 
uniformity fail to consider whether the agency problems arising from financial reports are related 
solely to interactions between a firm's current shareholders and their manager (“the internal 
agency problem") or are due to a combination of these interactions, and the agency problem
between current and prospective shareholders ("the external agency problem"). Such distinctions 
turn out to be central to any comparison of the relative desirability of rigid uniformity or complete 
discretion. For example, we show that if attention is confined to the internal agency problem, then 
moral hazard involving the manager's actions is always ameliorated by replacing rigid GAAP by 
discretionary GAAP. In contrast, we show that, when both internal and external agency problems 
are present concurrently, discretionary GAAP can be inferior to rigid GAAP. 

To understand these results, compare the information obtained about a firm's actual realized
expenses by observing the firm select among accounting procedures, first, when GAAP allows
discretion and, second, when GAAP is rigid. Since the latter "selection" process is degenerate
(there is only one accounting procedure in conformity with rigid GAAP), it follows that
discretionary GAAP is always more informative than rigid GAAP, even though discretionary
GAAP creates opportunities for managerial intervention into the expense recognition process.
This extra information generated about the firm under discretionary GAAP reduces the internal
agency problem between the current shareholders and their manager: it decreases the current
shareholders' expected cost of getting the manager to adopt a particular action, and expands the
set of actions they can induce the manager to implement. But, it is precisely this latter effect that
creates the possibility that discretionary GAAP will exacerbate the external agency problem
between current and prospective shareholders. This follows because in any agency setting, the
cost of getting an agent to select a particular action increases as the set of actions available to the
agent expands. Since discretionary GAAP expands the set of actions the current shareholders can
get their manager to implement, the contracting problem between generations of shareholders can
get worse. This latter effect can be so pronounced that, in at least some examples, rigid GAAP
results in a higher expected selling price for the firm, higher welfare for the current shareholders,
and better investment decisions by the prospective shareholders than discretionary GAAP.

² Some prominent commentators in the accounting profession have definite opinions about the relative advantages and
disadvantages of additional discretion in reporting choice. For example, Spacek (1961, 43) opines: "The objection has
[...] eliminate flexibility in accounting principles. But to my knowledge, not one
person has attempted to show where flexibility in the choice of alternative principles of accounting would result in
financial statements that were fair to all segments of the business community. The arguments were only that flexibility
was good, per se, and that the elimination of flexibility was bad, per se. Yet with respect to no single set of facts to be
accounted for was the theory of flexibility applied and reasoning advanced to show why the ‘flexible’ results were proper
or fair....Flexibility, as such, has not brought improvement; in fact, [with flexibility] the less desirable practices have
tended to drive out, or at least retard, acceptance of the good."

It seems intuitive, however, that the benefits of discretionary GAAP would generally 
outweigh its costs in “reasonable” cases, because—besides improving the internal agency 
problem—the information provided under discretionary GAAP will permit the prospective 
shareholders to better “fine-tune” the investments they make in the firm. In searching for 
conditions under which this intuition is valid, we must consider how earnings are constructed. 
Economic earnings are defined to be “gross” earnings reduced by applicable expenses. While we 
allow expenses reported under GAAP to be “soft” (in that the firm’s manager sometimes can 
influence whether reported expenses coincide with realized expenses), the gross earnings 
numbers are presumed to be “hard.” That is, though we allow for the possibility that reported gross
earnings may measure true gross earnings with error, the manager is presumed to be incapable
of altering the relation between realized and reported gross earnings.³

When gross earnings are measured with error, rigid GAAP may be superior to discretionary 
GAAP. But, we show that if GAAP-based gross earnings measure actual gross earnings without 
error, then discretionary GAAP is always strictly superior to rigid GAAP independent of other 
problem-specifics. We conclude that the desirability of giving a manager flexibility in how he 
reports one dimension of his performance depends upon the absence of errors in the measurement 
of other dimensions of his performance, and more generally, that the desirability of a change in 
one dimension of GAAP depends upon how other dimensions of GAAP are specified. While 
Bromwich (1980) established that restrictive conditions have to be satisfied in order for an 
individual’s measured expected utility to be independent of whether standards are evaluated on 
a piecemeal (or "partial") basis, this is—to our knowledge—the first formal demonstration that 
a piecemeal approach to evaluating standards can lead to welfare reductions. 

Another condition that gives rise to the general superiority of discretionary GAAP consists 
of making the contract between the current shareholders and their manager observable to 
prospective shareholders. The intuition behind this result, as well as a more comprehensive list 
of conditions under which discretionary GAAP is superior to rigid GAAP, is discussed in the text 
below. 

The analysis proceeds by discussing the internal and external agency problems separately, 
and then combining these two into a “global” problem. The solution to the internal agency 
problem, involving the contract between the current shareholders and their manager, is discussed 
next in section II. Section III studies the external agency problem that arises when the current 
shareholders try to influence prospective shareholders’ perceptions of the firm's value through 
the judicious choice of expensing procedures. Section IV combines the first two sections, and 
characterizes equilibria for the global problem. Section V compares the two GAAP regimes; it 
contains an extensive discussion of when rigid or discretionary GAAP is preferred. In the analysis 
of sections II-V, the contract between the current shareholders and their manager is assumed not
to be observable to the prospective shareholders. In section VI, the analysis proceeds under the
assumption that the manager's contract is public information. Section VII considers how sensitive
the results are to changes in a firm's system of internal controls. That section offers a rationale
for the recent proposal in the professional literature that would require firms to publish
information about the quality of the system of their internal controls. Section VIII summarizes our
conclusions, indicates some potential extensions, and offers some caveats about the analysis.
Proofs appear in the appendices.

³ To justify this asymmetric treatment of managerial discretion in expense recognition and gross earnings recognition,
we argue that while the manager may sometimes be capable of altering the timing of the recognition of future expenses,
he may not be able to alter gross earnings because his firm's auditor may be able to detect manipulations of gross earnings
easily.


II. THE INTERNAL AGENCY PROBLEM 


The time line of events for the global problem is depicted in figure 1. After GAAP is specified 
(stage 1), the current shareholders contract with their manager (stage 2), and the manager 
privately makes his action choices (stage 3). Then, the firm's current period accounting reports 
are disclosed (stage 4). The manager is paid according to the terms of the contract (stage 5), and 
the firm is sold to prospective shareholders (stage 6). After purchasing the firm, the prospective 
shareholders make an additional investment in the firm, the return on which varies with the firm's 
first period economic earnings (stage 7). The prospective shareholders use their knowledge of the 
firm's first period accounting earnings while making this investment, since this is the only 
observable proxy for the firm's first period economic earnings. Finally, the firm's return on 
investment is realized, and shareholders are paid in proportion to their ownership stakes in the 
firm (stage 8). 

In this section, we confine attention to the "internal" agency problem that arises when the 
current shareholders write a contract to get their manager to adopt a particular set of actions at the 
lowest expected cost, using GAAP-specific earnings measures. That is, we examine a model that 
consists only of stages 1 through 5. 

In formulating this problem, we attempt to capture some of the range of practicing managers’ 
action choices, as well as how the accounting process measures the outputs associated with those 
action choices. Specifically, the manager is postulated to select actions that affect the current 
(first) period's earnings in two ways: he can boost current period “gross” earnings (or revenues), 
and he can cut expenses that arise from current period operations that are not realized until the next 
period. We presume that the manager has no ability to alter the relation between reported and 
realized gross earnings. But, when GAAP allows for discretion in how to record current period 
expenses that are not realized until future periods, we provide the manager with the capability of 
intervening in this process. 

The two principal conclusions from this internal agency problem are as follows: first, when 
GAAP allows for flexibility in how expenses are reported, and the manager is requested to exert 
nontrivial effort to control expenses, the expected cost-minimizing contract always induces the 
manager to select the least conservative expensing procedure available, when he can intervene 
in the accounting measurement process. Second, if there is a rigid rule dictating how much 
expense must be booked in the current period (regardless of the firm's economic circumstances), 
it will be impossible to get the manager to exert nontrivial effort (at least in a one period setting) 
to control those expenses. 

The firm's current period economic earnings, $y$, are its gross earnings $x$ net of estimated 
expenses $c \times x$ that should be recognized in period 1 but are not realized until period 2.4 In other 
words, the firm's economic earnings are $y = (1 - c)x = dx$, its gross earnings discounted by 
anticipated future (unrealized) expenses, with $d \in D \equiv \{d_1 < \cdots < d_n\}$, $n \ge 2$. 


4 Throughout, a random variable is distinguished from its realization by the absence of a subscript. Thus, $x_j$ is a realization 
of the random variable $x$. 
Neither the firm’s economic earnings $y$ nor its constituent components (the gross earnings $x$ 
and the discount $d$) is presumed to be publicly observable. Instead, the firm’s (first period) 
accounting earnings, defined by 
$$\hat{y} \equiv \hat{d} \cdot \hat{x}, \tag{1}$$
where $\hat{d}$ denotes the reported discount and $\hat{x}$ denotes the estimated gross earnings, and both 
components ($\hat{d}$ and $\hat{x}$) are assumed to be observable. Accounting earnings imperfectly measure 
economic earnings here either because gross earnings are measured with error ($\hat{x}$ proxies for $x$) 
or because expenses may not be properly matched to the sales generating them. This latter 
possibility amounts to the wrong discount factor being used to convert estimated gross earnings 
into estimated economic earnings ($\hat{d}$ proxies for $d$). In this section, we take the measurement error 
involving gross earnings ($\hat{x} - x$) as exogenous, and concentrate on sources of potential discrepancy 
between $\hat{d}$ and $d$. 

In practice, whether the reported discount factor $\hat{d}$ coincides with the actual discount factor 
$d$ depends upon the range of the expensing procedures allowed under GAAP, as well as whether 
the manager opportunistically exploits any freedom in his reporting choice. In our model, an 
expensing procedure can be identified by its associated discount factor, so the amount of 
discretion in GAAP is determined by the set of allowable discount factors. We distinguish 
between two kinds of GAAP. 

Definition GAAP exhibits complete discretion if the set of allowable discount factors $\hat{D}$ 
coincides with $D$, the entire range of economically feasible discount factors. 

Definition GAAP exhibits rigid uniformity if the set of allowable discount factors consists 
of a single discount factor, $\hat{D} = \{d_r\}$. 


The time-varying accounting treatment of R&D expenses provides an illustration of a phenom- 
enon similar to the one we are trying to capture with these two definitions. Prior to SFAS 2, firms 
could use their judgment in deciding how much R&D cost to capitalize. After SFAS 2, R&D 
expenditures had to be expensed, regardless of whether these expenditures created any assets. 
That is, GAAP for R&D evolved from allowing (essentially) complete discretion to requiring 
rigid uniformity.5 

The tradeoffs between rigidly uniform GAAP and completely discretionary GAAP seem 
clear. Rigidly uniform GAAP prevents managers from communicating their firm’s actual 
economic position accurately, and so is disadvantageous. Completely discretionary GAAP may 
be disadvantageous as well, since the firm’s manager may exploit the discretion in GAAP and 
purposefully choose the wrong discount rate. However, if GAAP entails complete discretion, then 
the manager typically would prefer one expensing procedure over all others, independent of the 
realized value of the actual discount factor. So, if the manager faces no restrictions on his reporting 
choices under discretionary GAAP, there is de facto no difference between rigid and discretionary 
GAAP since, in each case, the observed expensing procedure would be uninformative about the 
true level of unrealized expenses. 

To make rigid and discretionary GAAP differ in an economically sensible way, we postulate 
that there is some random event that influences what the manager can report about the discount 
factor under discretionary GAAP.6 When one random event occurs, referred to as “the auditing 
system is effective,” the report $\hat{d}$ must coincide with $d$. When another random event occurs, 
referred to as “the auditing system is ineffective,” the actual realization of $d$ puts no constraints 
on the manager’s report $\hat{d}$: in that event, $\hat{d}$ can be any discounting procedure allowed under 
discretionary GAAP, i.e., any element of $\hat{D} = \{d_1 < \cdots < d_n\}$. We assume that the auditing system 
is ineffective with probability $p \in (0, 1)$, that the auditing system's effectiveness is independent 
of the realizations of $d$, $x$, or $\hat{x}$, and that the manager knows the realized values of 
$d$ and $\hat{x}$ and the auditing system's effectiveness prior to making his report.7 
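
The reporting mechanism just described can be illustrated with a small simulation. The Python sketch below is illustrative only: the function names, the numerical values chosen for the probabilities, and the restriction to a single gross-earnings realization (so that the manager's report under an ineffective audit reduces to one index) are assumptions made for the example and are not part of the model's specification.

```python
import random

def report_probabilities(q, p, ell):
    """Probability that each discount index is reported.

    q[i] : Pr(d = d_i | expense-control action)   (assumed values)
    p    : Pr(auditing system is ineffective)
    ell  : index the manager reports when the audit is ineffective
    """
    probs = [(1 - p) * qi for qi in q]  # effective audit: report equals truth
    probs[ell] += p                     # ineffective audit: always report d_ell
    return probs

def simulate_report(q, p, ell, rng):
    """Draw the true discount index, audit effectiveness, and the report."""
    true_index = rng.choices(range(len(q)), weights=q)[0]
    ineffective = rng.random() < p
    reported = ell if ineffective else true_index
    return true_index, ineffective, reported

q = [0.2, 0.3, 0.5]   # hypothetical distribution over d_1 < d_2 < d_3
p = 0.1               # hypothetical probability of an ineffective audit
ell = len(q) - 1      # manager reports the highest discount when unaudited
probs = report_probabilities(q, p, ell)
print(probs)
```

Under these assumed values the highest report occurs more often than the underlying state warrants, which is the contamination that an observer of the report has to undo.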

To complete the specification of the internal agency problem, we assume that the manager 
can independently influence estimated gross earnings $\hat{x}$ and the discount factor $d$ applicable to 
those earnings by selecting a pair of actions $(a_d, a_x)$ with $a_d \in A_d = \{\underline{a}_d, \bar{a}_d\}$, $a_x \in A_x = \{\underline{a}_x, \bar{a}_x\}$, 
$\underline{a}_d < \bar{a}_d$, and $\underline{a}_x < \bar{a}_x$. That is, if $q_i(a_d) \equiv \Pr(d = d_i \mid a_d)$ and $f_j(a_x) \equiv \Pr(\hat{x} = \hat{x}_j \mid a_x)$, $i \in N \equiv \{1, 2, \ldots, n\}$, 
$j \in M \equiv \{1, 2, \ldots, m\}$, $m \ge 2$, denote the marginal densities for $d$ and $\hat{x}$ conditional on $a_d$ and $a_x$, 
then the joint probability that $d = d_i$ and $\hat{x} = \hat{x}_j$ is given by $q_i(a_d) \times f_j(a_x)$. The firm's shareholders 
never learn the manager's actions or whether the auditing system was effective. 

The manager's utility function is assumed to be separable in consumption and effort, with 
$U(c) - g(a_d, a_x)$ denoting the manager’s utility from consuming $c$ and adopting action pair $(a_d, a_x)$. 
$U(\cdot)$ is strictly concave. The manager's opportunity cost of working for the firm is $\bar{U}$. All 
shareholders are risk-neutral. Each of the family of densities $\{q_i(a_d) \mid a_d \in A_d\}$, $\{f_j(a_x) \mid a_x \in A_x\}$ 
is nondegenerate8 and possesses the strict monotone likelihood ratio property.9 The conditional 
expectation $E[x \mid \hat{x}, a_x]$ is nonnegative and increasing in $a_x$ for each $\hat{x}$. 

The internal agency problem consists of determining how to get the manager to adopt any 
particular action pair at lowest expected cost when a particular GAAP regime is in effect. We start 


5 Rigid uniformity and complete discretion are polar extremes. While GAAP occasionally stipulates no discretion (as 
in SFAS 2) or complete discretion (APB 28 essentially allows any portion of quarterly expenditures for indirect expenses 
to be capitalized, at least on a quarterly basis), GAAP typically allows some, limited choice among accounting 
alternatives (e.g., LIFO versus FIFO, capitalized versus operating leases). The polar cases capture the trade-off between 
discretion and uniformity, but we acknowledge that both constitute stylizations of actual accounting standards. 

6 No corresponding notion of effective or ineffective auditing systems is required under rigidly uniform GAAP, of course. 

7 It is possible to modify the model further by assuming that the manager only gets to see an imperfect estimate of d. The 
substantive effect of such a change would be to add another noise term to the model. 

8 That is, no single outcome occurs with probability one. 

9 That is, $q_i(\bar{a}_d)/q_i(\underline{a}_d)$ and $f_j(\bar{a}_x)/f_j(\underline{a}_x)$ strictly increase in $i$ and $j$, respectively. 
by assuming that the GAAP regime allows complete discretion. Any contract the current 
shareholders offer the manager is completely described by the utility $v_{ij}$ from consumption (or 
equivalently, the cost $h(v_{ij}) \equiv U^{-1}(v_{ij})$ of providing that utility) that the manager receives when 
$\hat{x} = \hat{x}_j$ and the manager reports $\hat{d} = d_i$. In analyzing this problem, the current shareholders must 
anticipate how a contract will influence what report the manager makes when the auditing system 
is ineffective. Specifically, if the manager is offered the contract $v = \{v_{ij} \mid i \in N, j \in M\}$, then when 
$\hat{x} = \hat{x}_j$ and the auditing system is ineffective, the manager will claim that $\hat{d} = d_{\ell(j)}$ for some 
$\ell(j) \in \arg\max_i v_{ij}$.10 Call this selection $\ell(j)$ the manager's reporting policy. By using this 
knowledge of the manager's response to a contract, the owners of the firm can decompose the 
internal contracting problem into two parts. In the first part, the owners find the contract that gets 
the manager to select both a particular action pair $(a_d, a_x)$ and a particular reporting policy $\ell(\cdot)$ at 
minimum expected cost. In the second part, the owners search over all reporting policies to find 
a policy that induces the manager to adopt the action pair $(a_d, a_x)$ at least expected cost. 

The mathematical program that solves the first step for a fixed reporting policy $\ell(\cdot)$ and a 
fixed action pair $(a_d, a_x)$ is stated below in Program 1$(\ell, a_d, a_x)$. 


The Discretionary GAAP Internal Agency Program 


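Program 1 $(\ell, a_d, a_x)$ 
$$\hat{C}(\ell, a_d, a_x) \equiv \min_{v} \; (1-p)\sum_{i}\sum_{j} q_i(a_d) f_j(a_x) h(v_{ij}) + p\sum_{j} f_j(a_x) h(v_{\ell(j)j}),$$
subject to: 

$$v_{\ell(j)j} \ge v_{ij} \quad \text{for all } i \in \{1,\ldots,n\} \text{ and all } j \in \{1,\ldots,m\};$$

(IC) $$(1-p)\sum_{i}\sum_{j} q_i(a_d) f_j(a_x) v_{ij} + p\sum_{j} f_j(a_x) v_{\ell(j)j} - g(a_d, a_x) \ge (1-p)\sum_{i}\sum_{j} q_i(a_d') f_j(a_x') v_{ij} + p\sum_{j} f_j(a_x') v_{\ell(j)j} - g(a_d', a_x') \quad \text{for all } a_d', a_x';$$
(IR) $$(1-p)\sum_{i}\sum_{j} q_i(a_d) f_j(a_x) v_{ij} + p\sum_{j} f_j(a_x) v_{\ell(j)j} - g(a_d, a_x) \ge \bar{U}.$$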


The first set of constraints ensures that the manager will be encouraged to select the desired 
reporting policy; the second set of constraints (labeled IC) ensures that the manager will be 
induced to select the desired action pair. And the third set of constraints (labeled IR) ensures that 
the manager's minimum utility constraint is satisfied. If it is impossible to get the manager to 
adopt action pair $(a_d, a_x)$ and the reporting policy $\ell(\cdot)$ simultaneously, then set $\hat{C}(\ell, a_d, a_x) = \infty$. 
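
To make the structure of Program 1 concrete, the following Python sketch evaluates the program's objective for a given contract and checks the first set of (reporting) constraints. It is a minimal illustration under stated assumptions: the function names, the square-root utility (so that the cost of providing utility v is v squared), and all numerical values are hypothetical, and the sketch does not solve the minimization or check the IC and IR constraints.

```python
def expected_cost(v, q, f, p, ell, h):
    """Objective of Program 1 for a fixed contract v and reporting policy ell.

    v[i][j] : utility paid when d_i is reported and gross earnings are x_j
    q[i]    : Pr(d = d_i) under the induced expense-control action
    f[j]    : Pr(x_hat = x_j) under the induced gross-earnings action
    p       : Pr(auditing system is ineffective)
    ell[j]  : discount index reported when the audit is ineffective
    h       : cost of providing a given utility level (inverse of U)
    """
    n, m = len(q), len(f)
    audited = sum(q[i] * f[j] * h(v[i][j]) for i in range(n) for j in range(m))
    unaudited = sum(f[j] * h(v[ell[j]][j]) for j in range(m))
    return (1 - p) * audited + p * unaudited

def reporting_constraints_hold(v, ell):
    """True if reporting d_ell(j) is (weakly) best for the manager for every j."""
    n, m = len(v), len(v[0])
    return all(v[ell[j]][j] >= v[i][j] for i in range(n) for j in range(m))

# Hypothetical example with n = 2 discount factors and m = 2 earnings levels.
h = lambda u: u ** 2              # assumes U(c) = sqrt(c), which is strictly concave
v = [[1.0, 1.5], [2.0, 2.5]]      # utility increasing in the reported discount
q, f, p = [0.4, 0.6], [0.5, 0.5], 0.2
cost = expected_cost(v, q, f, p, ell=[1, 1], h=h)
print(cost, reporting_constraints_hold(v, [1, 1]), reporting_constraints_hold(v, [0, 0]))
```

With this contract the policy of always reporting the highest discount satisfies the reporting constraints, while the policy of always reporting the lowest does not.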

Let $C(a_d, a_x)$ denote the minimum expected cost to the owners of getting the manager to choose 
the action pair $(a_d, a_x)$. Letting $L$ be the space of all functions $\ell: M \to N$, then 
$C(a_d, a_x) = \min_{\ell \in L} \hat{C}(\ell, a_d, a_x)$. The following lemma characterizes the reporting policy and contract that solve 
the internal agency problem, i.e., that achieve the minimum $C(a_d, a_x)$. 


Lemma 1 Let $v^* = \{v^*_{ij} \mid i \in N, j \in M\}$ and $\ell^*(\cdot)$ achieve the minimum $C(a_d, a_x)$. Under completely 
discretionary GAAP, 

(a) if $a_d = \bar{a}_d$, then $\ell^*(\cdot) \equiv n$ and $v^*_{ij}$ is strictly increasing in $i$ for each $j$; 

(b) if $a_d = \underline{a}_d$, then $\ell^*(\cdot)$ is arbitrary and $v^*_{ij}$ is independent of $i$ for each $j$. 


10 In general, $\ell(j)$ will not be unique. Lack of uniqueness will pose no problem in what follows, since randomization of 
reports will not be optimal (because the optimization problem the current investors face constitutes a concave program). 
The first part of the lemma states that to minimize the expected cost of getting the manager to adopt 
the “high” action $\bar{a}_d$, the manager should be encouraged to make the highest possible report 
regarding $d$ when the auditing system is ineffective. This result is due to two offsetting effects. 
On the one hand, if $i = \ell(j)$, then when the manager reports $\hat{d} = d_i$ and $\hat{x} = \hat{x}_j$ the owners cannot 
discern which of “$d = d_i$” or “the auditing system was ineffective” occurred. This contamination 
of the report $d_i$ reduces its informativeness about the manager’s action $a_d$ and hence its 
contracting value. If $\ell(j) \neq n$, i.e., the manager is not induced to make the report $d_n$ when the 
auditing system is ineffective, then the report $d_n$ is a more discriminating signal about whether the 
manager took the action $\bar{a}_d$ than any other report, since $q_n(\bar{a}_d)/q_n(\underline{a}_d) > q_i(\bar{a}_d)/q_i(\underline{a}_d)$ for all 
$i < n$ when the distribution of $d$ possesses strict MLRP. Consequently, inducing the manager to 
report $d_n$ when the auditing system is ineffective reduces the contracting value of the most 
informative report the manager can make. 

But, on the other hand, $i = \ell(j)$ requires that the manager’s compensation from reporting 
anything other than $d_i$ when the auditing system is ineffective must be no higher than that from 
reporting $d_i$, i.e., $v_{ij} \ge v_{i'j}$ for all $i'$ must hold. These constraints bind least when $i = n$. This latter 
effect can be shown to be more significant than the contaminating effect mentioned previously, 
and so the conclusion of lemma 1(a) follows.11 

When the manager is induced to adopt $\underline{a}_d$, the manager's choice among reporting procedures 
does not affect the manager's compensation (i.e., $v_{ij}$ does not depend on $i$), creating the 
indifference among Programs 1$(\ell, \underline{a}_d, a_x)$ described in lemma 1(b). 

When rigid GAAP is in place, a program similar to Program 1$(\ell, a_d, a_x)$ can be constructed to 
aid in characterizing the optimal contract. But it is unnecessary to provide a detailed statement 
and analysis of such a program. When GAAP is rigid, the current shareholders cannot induce the 
manager to exert nontrivial effort to reduce expenses, because what the manager must report about 
$d$ is always independent of the actual realization of $d$. Moreover, when the manager is induced to 
exert the effort pair $(\underline{a}_d, a_x)$, for any $a_x \in A_x$, the cost of getting him to adopt that effort pair is given 
by $C(\underline{a}_d, a_x)$, just as in the case where GAAP entailed complete discretion. 


III. THE EXTERNAL AGENCY PROBLEM 


In this section, we study that facet of the global problem involving the relation between the 
equilibrium selling price of the firm (paid by prospective shareholders to the current shareholders) 
and the firm's current period accounting report. We assume away all agency problems between 
the current shareholders and their manager to concentrate exclusively on the valuation-related 
effects of the firm's accounting report. In terms of the time line in figure 1, we ignore the moral 
hazard problem associated with stages 2, 3, and 5 for the moment, and instead concentrate on the 
relationships among stages 4, 6, 7, and 8. 

The principal conclusions are as follows. When GAAP entails complete discretion, a new 
moral hazard problem occurs between the current and prospective shareholders: when the 
auditing system is ineffective, the current shareholders will instruct their manager to select an 
accounting reporting policy that gets the prospective shareholders to pay the highest possible 
price for the firm. We show that the only equilibrium reporting policy entails the manager always 
selecting the least conservative accounting procedure available.12 (When GAAP exhibits rigid 
uniformity, there is no corresponding moral hazard problem between the current and prospective 


" Results related to lemma 1 include Corollary 1.1 in Evans and Sridhar (1995) and Proposition 1 in Dye (1988) and Dye 
and Magee (1991). 

12 Observe that this is the same nonconservative reporting policy that solved the internal agency problem when GAAP 
exhibited discretion. This will be an important fact, and will be used frequently in subsequent sections. 
shareholders, since the manager is confined to select the same expensing procedure in all cases.) 
In addition, we obtain an explicit characterization of the equilibrium price of the firm as a function 
of the firm’s accounting report, for use in subsequent sections. 

For all of this section, both current and prospective shareholders take as given the action 
choices of the manager and the cost of compensating the manager. By purchasing the firm, the 
prospective shareholders acquire property rights to the firm’s second period economic earnings. 
To create a simple, but plausible, way for earnings announcements to affect prices in our model, 
we assume that the prospective shareholders’ return on additional investment in the firm depends 
on the level of first period economic earnings. We retain the assumption that the firm's (first 
period) accounting earnings are the only observable proxy for the firm's (first period) economic 
earnings. 

To be specific, we assume that the firm's first ($y$) and second ($y_2$) period economic earnings 
are related through a particular investment function $F(\cdot)$: $y_2 = F(y, \theta, I)$, where $I$ is an additional 
investment the prospective shareholders make in the firm and $\theta$ is a random variable independent 
of all other variables in the model.14 While the results that follow will hold for many alternative 
functions $F(\cdot)$,15 we will confine attention to those functions $F(\cdot)$ that belong to a variation of the 
Cobb-Douglas class: 

$$F(y, \theta, I) = [\beta/(\beta - 1)]\, k^{1/\beta}\, \theta\, y\, I^{1 - 1/\beta}, \quad \text{where } k > 0,\ \beta > 1, \text{ and } E[\theta] = 1.^{16} \tag{2}$$


For additional simplicity, we assume that the prospective shareholders do not need to hire a 
manager to run the firm for them.17 
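
The role of the Cobb-Douglas form in (2) can be seen from a short derivation of the prospective shareholders' optimal investment. The steps below are a sketch under the assumption that $y$ stands for the shareholders' estimate of first period economic earnings (because $F$ is linear in $y$ and $\theta$, and $E[\theta] = 1$, only this estimate matters); the symbols $I^*$ and $V(y)$ are introduced here for illustration only.

```latex
\begin{aligned}
V(y) &\equiv \max_{I}\; E[F(y,\theta,I)] - I
      = \max_{I}\; \frac{\beta}{\beta-1}\,k^{1/\beta}\,y\,I^{\,1-1/\beta} - I, \\
\text{first-order condition:}\quad
     & k^{1/\beta}\,y\,I^{-1/\beta} = 1
       \;\Longrightarrow\; I^{*} = k\,y^{\beta}, \\
V(y) &= \frac{\beta}{\beta-1}\,k^{1/\beta}\,y\,\bigl(k\,y^{\beta}\bigr)^{(\beta-1)/\beta} - k\,y^{\beta}
      = \frac{\beta}{\beta-1}\,k\,y^{\beta} - k\,y^{\beta}
      = \frac{k}{\beta-1}\,y^{\beta}.
\end{aligned}
```

Because $0 < 1 - 1/\beta < 1$ when $\beta > 1$, the objective is concave in $I$, so the first-order condition identifies the maximum. The maximized value depends on $y$ only through $y^{\beta}$, which is the form in which earnings estimates enter the selling prices.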


The External Agency Problem and Discretionary GAAP 


The analysis begins with the case where GAAP entails complete discretion. We assume that 
the prospective shareholders, like the current shareholders, do not know when the firm's auditing 
system is effective. Consequently, the selling price of the firm will vary depending upon (1) what 
the prospective shareholders infer about the auditing system's effectiveness from observing the 
firm's accounting earnings (and its constituent components) and (2) their conjecture about the 
manager's reporting policy. If $\ell^c(\cdot)$ denotes the reporting policy they believe the manager has 
adopted, the selling price $P_{ij}$ of the firm when $\hat{d} = d_i$ and $\hat{x} = \hat{x}_j$ is given by: 


13 Such an intertemporal dependence can be justified by assuming that either consumer tastes or the firm's production 
technology is, at least to some extent, stationary over time (e.g., if demand for a product is high in period 1, it is likely 
to be high in period 2; a production technology that is efficient in period 1 is also likely to be efficient in period 2, etc.). 

14 Classical accounting theorists should note that what we refer to as first period economic earnings in the text are not 
economic earnings in the Hicksian sense, as our definition of economic earnings does not take into account all wealth 
changes the owners of the firm experience as a consequence of events that transpire during the first period. Such readers 
may prefer to refer to what we have called the firm's first period economic earnings by some other suitable name, e.g., 
earnings unadjusted for capital appreciation. 

15 The critical requirement that $F$ must satisfy is that if “$\omega$” (abstractly) depicts the information on which period 2 
investment is based ($\omega$ consists of the firm's accounting earnings, the accounting procedure according to which the 
earnings were constructed, and prevailing GAAP), then the value of the objective function (evaluated at its optimum) 
of the program $\max_I E[F(y, \theta, I) \mid \omega] - I$ can be written as some function $\phi = \phi(E[y \mid \omega])$. 
The function $F(y, \theta, I) = [\beta/(\beta - 1)]\, k^{1/\beta}\, \theta\, y\, I^{1 - 1/\beta}$ satisfies these requirements for any $\beta > 1$, but so do many other 
functions. 

16 The restriction to $E[\theta] = 1$ and the scalar coefficient $[\beta/(\beta - 1)]k^{1/\beta}$ are adopted merely to simplify notation; deleting 
these restrictions has no substantive economic effect in what follows. 

17 Or, what amounts to the same thing, that the function $F(\cdot)$ is a reduced form representation, with any costs associated 
with the second generation’s agency problem having already been “netted out” in the function $F(\cdot)$. 
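for $i \neq \ell^c(j)$, 
$$\begin{aligned} P_{ij} &= \max_{I}\; E\bigl[F(d_i E[x \mid \hat{x} = \hat{x}_j, a_x], \theta, I)\bigr] - I \\ &= \frac{k}{\beta - 1}\bigl(d_i E[x \mid \hat{x} = \hat{x}_j, a_x]\bigr)^{\beta}. \end{aligned} \tag{3}$$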


When $i \neq \ell^c(j)$, prospective shareholders are convinced that the auditing system is effective, 
because they do not believe the manager would have reported that the discount was $d_i$ were the 
auditing system ineffective. Therefore, they believe that the applicable discount factor for period 
one earnings is $d_i$ and hence that their best estimate of period 1 economic earnings is 
$d_i E[x \mid \hat{x} = \hat{x}_j, a_x]$. This estimate of economic earnings is used to calculate the optimal 
second period investment $I$. Since all second period shareholders can make these same computations, 
competition among them will drive the selling price of the firm to the amount appearing 
in the last line of (3). 

The situation is more complex when $i = \ell^c(j)$. In that case, the prospective shareholders 
believe that the earnings report $d_i \hat{x}_j$ could arise either because $d = d_i$ and the auditing system was 
effective or because the auditing system was ineffective. If these shareholders knew that the latter 
event occurred, then they would infer nothing about the realization of $d$ other than what they knew 
from their prior beliefs, i.e., $E[d \mid \text{auditing system is ineffective, report } d_i \text{ is made, and } a_d] = 
E[d \mid a_d]$. But, since the shareholders are unsure whether the auditing system was ineffective, they 
must use Bayes' rule to revise their beliefs as follows: 

for $i = \ell^c(j)$: 
$$\Pr(\text{effective auditing system} \mid d_i \text{ is reported}, a_d) = \frac{(1 - p)\, q_i(a_d)}{p + (1 - p)\, q_i(a_d)};$$

$$E[d \mid d_i \text{ is reported}, a_d] \equiv m(a_d, i, j) = \frac{p\, E[d \mid a_d] + (1 - p)\, q_i(a_d)\, d_i}{p + (1 - p)\, q_i(a_d)},$$
and hence for $i = \ell^c(j)$, 
$$\begin{aligned} P_{ij} &= \max_{I}\; E\bigl[F(m(a_d, i, j)\, E[x \mid \hat{x} = \hat{x}_j, a_x], \theta, I)\bigr] - I \\ &= \frac{k}{\beta - 1}\bigl(m(a_d, i, j)\, E[x \mid \hat{x} = \hat{x}_j, a_x]\bigr)^{\beta}. \end{aligned} \tag{4}$$


Here, $m(a_d, i, j)$ is the expected value of the actual discount to be applied to gross earnings when 
the prospective shareholders believe the manager adopted action $a_d$, reporting policy 
$\ell^c(\cdot)$, and $\ell^c(j) = i$. 
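
The Bayesian revision and the resulting prices can be computed directly. The Python sketch below is a minimal illustration, assuming hypothetical values for the discount factors, probabilities, and the parameters k and beta; the function names are introduced here for the example. It prices a report at the posterior mean discount when the report matches the conjectured ineffective-audit report, and at face value otherwise.

```python
def expected_discount(d, q):
    """Prior mean E[d] given discount factors d and probabilities q."""
    return sum(di * qi for di, qi in zip(d, q))

def posterior_discount(d, q, p, i):
    """E[d | d_i is reported] when d_i is the conjectured ineffective-audit report."""
    prior_mean = expected_discount(d, q)
    return (p * prior_mean + (1 - p) * q[i] * d[i]) / (p + (1 - p) * q[i])

def selling_price(d, q, p, i, conjectured, expected_gross, k, beta):
    """Price when d_i is reported and expected_gross is the conditional mean of x."""
    if i == conjectured:
        discount = posterior_discount(d, q, p, i)  # report may be uninformative
    else:
        discount = d[i]                            # audit believed effective
    return (k / (beta - 1)) * (discount * expected_gross) ** beta

# Hypothetical inputs: three discount factors, conjectured policy "report d_3".
d = [0.6, 0.8, 0.9]
q = [0.2, 0.3, 0.5]
p, k, beta = 0.1, 1.0, 2.0
m_top = posterior_discount(d, q, p, i=2)
price_mid = selling_price(d, q, p, i=1, conjectured=2, expected_gross=10.0, k=k, beta=beta)
price_top = selling_price(d, q, p, i=2, conjectured=2, expected_gross=10.0, k=k, beta=beta)
print(m_top, price_mid, price_top)
```

In this example the posterior mean discount for the top report lies between the second-highest and the highest discount factor, so the top report still commands the highest price.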

The prospective shareholders cannot observe the manager’s reporting policy, so the current 
shareholders will take prospective shareholders’ beliefs $\ell^c(\cdot)$ about it as given, and then induce 
their manager to select a reporting policy $\ell(\cdot)$ that maximizes the firm's expected selling price. 
An equilibrium occurs when the prospective shareholders' expectations regarding the manager's 
reporting policy coincide with the reporting policy the current shareholders encourage the 
manager to adopt. More formally, 

Definition For a fixed action pair $(a_d, a_x)$, a reporting policy $\ell^c(j)$ is an equilibrium in the 
external agency problem if for every $j$, $\ell^c(j) \in \arg\max_i P_{ij}$, 
where $P_{ij}$ is defined by (3) for $i \neq \ell^c(j)$ and by (4) for $i = \ell^c(j)$. 


The following lemma asserts that the only possible equilibrium reporting policy in the external 
agency problem entails that the manager select the least conservative expensing procedure when 
the firm's auditing system is ineffective, i.e., $\ell^c(j) \equiv n$, and it provides both necessary and 
sufficient conditions for such a reporting policy to be an equilibrium. 
Lemma 2: Suppose that GAAP entails complete discretion in the external agency problem. 
(a) If there exists an equilibrium reporting policy, it is unique and given by $\ell^c(\cdot) \equiv n$. 
(b) The condition $d_{n-1} \le m(a_d, n, \cdot)$ is both necessary and sufficient for an equilibrium 
reporting policy to exist. 
The intuition for lemma 2(a) is this: the lower prospective shareholders believe the firm's realized 
expenses are, the more they are willing to pay for the firm. So, if prospective shareholders thought 
that the manager chose to use an accounting procedure corresponding to some discount $d_i < d_n$ 
when the firm's auditing system was ineffective, then any manager interested in maximizing the 
selling price of the firm would thwart those beliefs by choosing the least conservative accounting 
procedure (i.e., the one corresponding to the discount $d_n$) when the auditing system was ineffective. 
The necessary and sufficient condition $d_{n-1} \le m(a_d, n, j)$ for an equilibrium to exist in lemma 
2(b) is likely to hold in practice, if the probability $p$ that the auditing system is ineffective is 
sufficiently close to zero. 
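
The dependence of the existence condition on the probability of an ineffective audit can be checked numerically. The Python sketch below is illustrative only: the function names and the discount factors and probabilities are hypothetical values chosen so that the condition holds for a small probability of audit failure and fails for larger ones.

```python
def posterior_top_discount(d, q, p):
    """Posterior mean of d when the highest discount d_n is reported and
    d_n is also the conjectured report under an ineffective audit."""
    prior_mean = sum(di * qi for di, qi in zip(d, q))
    return (p * prior_mean + (1 - p) * q[-1] * d[-1]) / (p + (1 - p) * q[-1])

def equilibrium_exists(d, q, p):
    """Check the condition that the second-highest discount factor does not
    exceed the posterior mean discount attached to the highest report."""
    return d[-2] <= posterior_top_discount(d, q, p)

# Hypothetical inputs: a low prior mean makes the top report easy to contaminate.
d = [0.2, 0.85, 0.9]
q = [0.5, 0.25, 0.25]
results = {p: equilibrium_exists(d, q, p) for p in (0.01, 0.05, 0.5)}
print(results)
```

As the probability of an ineffective audit shrinks toward zero, the posterior mean attached to the top report approaches the highest discount factor itself, which is why the condition is satisfied for sufficiently small values in this example.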


The External Agency Problem and Rigid GAAP 


When GAAP is rigid, there is no concern about the existence or nature of an equilibrium 
accounting reporting policy for the external agency problem, because the manager has no 
discretion to exercise in this case. If rigid GAAP is defined by the discount factor $\hat{D} = \{d_r\}$, and 
if gross earnings are given by $\hat{x} = \hat{x}_j$, then the manager must report that period 1 accounting 
earnings are $d_r \hat{x}_j$ regardless of the realization of $d$ and the auditing system's effectiveness. 

When rigid GAAP prevails, nothing can be inferred about the actual realization of the 
discount factor from the reported value of the discount factor, so if prospective shareholders 
conjecture that the manager adopted effort $a_d$ to reduce expenses,18 then their inference about the 
realized value of $d$ after observing any earnings report $\hat{d}\hat{x}$ will be the same as the prior, 
unconditional expectation $E[d \mid a_d]$. Thus, the equilibrium expected selling price of the firm is 

$$\begin{aligned} P_j &= \max_{I}\; [\beta/(\beta - 1)]\, k^{1/\beta}\, E[d \mid a_d]\, I^{1 - 1/\beta}\, E[x \mid \hat{x} = \hat{x}_j, a_x] - I \\ &= \frac{k}{\beta - 1}\bigl(E[d \mid a_d] \cdot E[x \mid \hat{x}_j, a_x]\bigr)^{\beta}, \end{aligned} \tag{5}$$

when the prospective shareholders believe that the manager adopted effort $a_x$ to increase gross 
earnings. 


IV. THE GLOBAL PROBLEM 


We can now analyze the global problem in which the current shareholders take into account 
how the contract they offer the manager influences both the manager's action choices and 
reporting policy. The objective of the current shareholders is to maximize the sum of the firm's 
first period's economic earnings (over which they retain property rights) and the expected selling 
price of the firm, net of the expected cost of compensating the manager. In this section, all eight 
stages of the time line are taken into account. 

We show that the internal agency problem can be integrated with the external agency problem 
to define an equilibrium of the global problem, and we demonstrate that the equilibrium expected 
selling price of the firm can be expressed as a product of functions involving the firm's expected 
gross earnings and the discount factor to be applied to those earnings. 


18 Of course, $a_d = \underline{a}_d$ is the only equilibrium inference under rigid GAAP. 
The Global Problem and Discretionary GAAP 


When GAAP entails complete discretion, the accounting reporting policy that solves the 
global problem is clear, since the reporting policy that encourages the manager to adopt any 
particular action pair (at least expected cost) coincides with the reporting policy that maximizes 
the expected selling price of the firm, by lemmas 1 and 2. The only policy that always solves both 
of these problems is $\ell^*(\cdot) \equiv n$, so this reporting policy solves the global problem as well.19 

The expressions for $P_{ij}$ that appear in (3) and (4) specify the selling price of the firm, for each 
discount factor $d_i$ and each realization of estimated gross earnings $\hat{x}_j$, conditional on the 
prospective shareholders’ perceptions of the actions $(a_d, a_x)$ that the current shareholders induced 
their manager to adopt. When the contract the current shareholders offer to their manager is not 
observable to prospective shareholders, the current shareholders take the prices $P_{ij}$ generated by 
these perceptions as given. It follows that, if the current shareholders induce the manager to adopt 
the action pair $(a_d^*, a_x^*)$ (which need not correspond to $(a_d, a_x)$), then they will perceive the expected 
proceeds from the sale of the firm to be (calculated before the realization of either $d$ or $\hat{x}$): 

$$(1 - p)\sum_{i < n}\sum_{j} q_i(a_d^*) f_j(a_x^*) P_{ij} + \sum_{j}\bigl(p + (1 - p)\, q_n(a_d^*)\bigr) f_j(a_x^*) P_{nj}. \tag{6}$$


To explain (6), note that the probability that the firm sells for $P_{ij}$ for $i < n$ is the same as the 
probability of the joint event that the control system is effective, $d = d_i$, and $\hat{x} = \hat{x}_j$. This 
probability equals $(1 - p)\, q_i(a_d^*) f_j(a_x^*)$. And the firm sells for $P_{nj}$ only when $\hat{x} = \hat{x}_j$ and the manager 
reports $\hat{d} = d_n$; this latter event occurs with probability $(p + (1 - p)\, q_n(a_d^*)) f_j(a_x^*)$. 

Recall that, for a fixed level of investment, the firm's second period's earnings are an 
increasing function of its first period earnings (from (2) above), and the latter is a product of the 
firm's first period gross earnings and the appropriate discount applicable to those earnings (from 
(1) above). Consequently, one might conjecture that the firm's market value (in (6)) could itself 
be written as some function of the firm's first period gross earnings and the appropriate discount 
factor. By substituting the expressions for $P_{ij}$ from (3) and (4) and rearranging terms, we can 
rewrite (6) as the product 

$$\delta(a_d^*, a_d) \cdot \mathrm{EGE}(a_x^*, a_x), \tag{7}$$

where $\delta(a_d^*, a_d) \equiv (1 - p)\sum_{i < n} q_i(a_d^*)\, d_i^{\beta} + [p + (1 - p)\, q_n(a_d^*)]\, m(a_d, n, j)^{\beta}$ is that portion of the selling 
price that is influenced by the firm’s first period future expenses and 
$$\mathrm{EGE}(a_x^*, a_x) \equiv \frac{k}{\beta - 1}\sum_{j} f_j(a_x^*)\bigl(E[x \mid \hat{x}_j, a_x]\bigr)^{\beta} \tag{8}$$

is that portion of the selling price affected by the firm's gross earnings. 

The current shareholders' problem then consists of deciding which action pair maximizes the 
sum of the firm's first period earnings and expected selling price, net of the cost of compensating 
the manager, i.e., their problem is to select $(a_d^*, a_x^*)$ to maximize:20 

$$E[d \mid a_d^*] \cdot E[x \mid a_x^*] + \delta(a_d^*, a_d) \cdot \mathrm{EGE}(a_x^*, a_x) - C(a_d^*, a_x^*).$$


19 Dye (1988) and Dye and Magee (1991) establish the optimality of nonconservative income reporting by managers in 
models where there was no explicit specification of GAAP, and no notion of an effective or ineffective auditing system. 
Antle and Nalebuff (1992) establish the potential nonconservative proclivities of auditors in an auditor-manager game. 

20 Note that since the first generation retains property rights over first period earnings, it evaluates these earnings 
$E[d \mid a_d^*] \cdot E[x \mid a_x^*]$ at the actions the manager actually adopts. 

In equilibrium, the actions $(a_d, a_x)$ the prospective shareholders conjecture the manager adopted 
must coincide with the actions $(a_d^*, a_x^*)$ the current shareholders induce the manager to adopt. 
Formally, 


Definition Under discretionary GAAP, a global equilibrium is an action pair (ad,ad), 
satisfying (ad,ad) e arg por Big lat]: Ex 1 a7] - 6(a7, a4): EGE(27,a2) — C(a?, aš). 
aar 
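
The fixed-point character of this definition (a conjecture is an equilibrium only if the current shareholders' best response confirms it) can be illustrated on a small grid. The sketch below is not taken from the paper: every number in it, and the names `EARNINGS`, `COST`, `BELIEF_X`, `FACTOR_DISCRETIONARY` and `FACTOR_RIGID`, are hypothetical, and the price is simplified to a product of a factor driven by the actual expense-related action and a belief driven by the conjectured gross earnings-related action. For comparison, the same routine is also run with the expense-related action fixed at its low level, as under rigid GAAP.

```python
from itertools import product

ACTIONS = ("low", "high")

# Hypothetical first period earnings and compensation costs per (a_d, a_x).
EARNINGS = {("low", "low"): 0.00, ("low", "high"): 0.05,
            ("high", "low"): 0.04, ("high", "high"): 0.09}
COST = {("low", "low"): 0.0, ("low", "high"): 0.003,
        ("high", "low"): 0.003, ("high", "high"): 1.0}
# Hypothetical price pieces: the buyers' belief about gross earnings depends on
# the conjectured x-action; the expense factor depends on the actual d-action.
BELIEF_X = {"low": 0.1, "high": 0.3}
FACTOR_DISCRETIONARY = {"low": 0.1, "high": 0.5}
FACTOR_RIGID = 0.2  # reports carry no information about the realized discount


def payoff_discretionary(actual, conjecture):
    price = FACTOR_DISCRETIONARY[actual[0]] * BELIEF_X[conjecture[1]]
    return EARNINGS[actual] + price - COST[actual]


def payoff_rigid(actual, conjecture):
    price = FACTOR_RIGID * BELIEF_X[conjecture[1]]
    return EARNINGS[actual] + price - COST[actual]


def global_equilibria(payoff, feasible):
    """Return the conjectures that the shareholders' best response confirms."""
    equilibria = []
    for conjecture in feasible:
        best = max(feasible, key=lambda pair: payoff(pair, conjecture))
        if payoff(conjecture, conjecture) == payoff(best, conjecture):
            equilibria.append(conjecture)
    return equilibria


ALL_PAIRS = list(product(ACTIONS, ACTIONS))
RIGID_PAIRS = [pair for pair in ALL_PAIRS if pair[0] == "low"]

print(global_equilibria(payoff_discretionary, ALL_PAIRS))
print(global_equilibria(payoff_rigid, RIGID_PAIRS))
```

With these illustrative numbers the only confirmed conjecture in the unrestricted problem is ("high", "low"), while the restricted problem settles on ("low", "high"); the point is only the mechanics of checking conjectures against best responses, not the magnitudes.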


The Global Problem and Rigid GAAP 


Under rigid GAAP, the reported discount factor is independent of the realization of the actual 
discount factor. Consequently, the manager will never exert nontrivial effort to reduce expenses. 
This implies that $a_d = \underline{a}_d$ is a feature of any equilibrium under rigid GAAP. It follows from pricing 
equation (5) that if the prospective shareholders conjecture that the manager adopted the action 
pair $(\underline{a}_d, \hat{a}_x)$, when the current shareholders in fact encouraged the manager to adopt action pair 
$(\underline{a}_d, a_x)$, then the current shareholders' expected proceeds from the sale of the firm are²¹ 

$$E[d \mid \underline{a}_d]^{\beta} \cdot k(\beta-1) \sum_j f_j(a_x) \bigl(E[x \mid X_j, \hat{a}_x]\bigr)^{\beta} = E[d \mid \underline{a}_d]^{\beta} \cdot EGE(\hat{a}_x, a_x). \qquad (9)$$


Note that the factor $E[d \mid \underline{a}_d]^{\beta}$ multiplying $EGE(\hat{a}_x, a_x)$ in (9) is strictly smaller than the 
corresponding factor $\delta(\hat{a}_d, a_d)$ under discretionary GAAP.²² This smaller discount arises because 
the amount of information revealed about realized expenses is always lower under rigid GAAP. 
It follows from this comparison that if the values of the manager's conjectured and actual action 
choices $(\hat{a}_x, a_x)$ involving gross earnings are the same under rigid and discretionary GAAP, then: 

$$E[d \mid \underline{a}_d]^{\beta} \cdot EGE(\hat{a}_x, a_x) < \delta(\hat{a}_d, a_d) \cdot EGE(\hat{a}_x, a_x), \qquad (10)$$


that is, discretionary GAAP always produces a higher expected selling price for the firm. When 
we compare equilibria under rigid and discretionary GAAP in the next section, we will return to 
this observation. 
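
The direction of this comparison is an instance of Jensen's inequality: pricing on the mean discount factor yields less than pricing on each realization when the exponent exceeds one. The sketch below illustrates this with the discount values and exponent reported in exhibit 1, under two assumptions that are not from the paper: reports are taken to be fully revealing (the limiting case p = 0), and the exhibit's probability .89 is read as the probability of the higher discount value under low effort.

```python
BETA = 2                        # exponent reported in exhibit 1
D_VALUES = (0.01, 0.11)         # d_1 and d_2 from exhibit 1
PROB_LOW_EFFORT = (0.11, 0.89)  # assumed reading: Pr(d_1), Pr(d_2) under low effort


def rigid_factor(values, probs, beta):
    """Factor when buyers learn nothing about d: (E[d]) ** beta."""
    mean = sum(p * d for p, d in zip(probs, values))
    return mean ** beta


def full_information_factor(values, probs, beta):
    """Factor when buyers learn d exactly: E[d ** beta]."""
    return sum(p * d ** beta for p, d in zip(probs, values))


uninformed = rigid_factor(D_VALUES, PROB_LOW_EFFORT, BETA)
informed = full_information_factor(D_VALUES, PROB_LOW_EFFORT, BETA)
print(round(uninformed, 6), round(informed, 6))
```

Under these assumptions the uninformed factor is about .0098 and the fully informed factor about .0108, so the ordering in (10) holds for the same action choices.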

The appropriate definition of a rigid GAAP equilibrium now follows by analogy to the 
definition of an equilibrium under discretionary GAAP, using (9). 


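Definition²³ Under rigid GAAP, a global equilibrium is an action pair $(\underline{a}_d, a_x^r)$ satisfying 

$$a_x^r \in \arg\max_{a_x} \; E[d \mid \underline{a}_d] \cdot E[x \mid a_x] + E[d \mid \underline{a}_d]^{\beta} \cdot EGE(a_x^r, a_x) - C(\underline{a}_d, a_x).$$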


V. COMPARING GAAP REGIMES 


Discretionary GAAP appears to have several inherent advantages over rigid GAAP. It 
constitutes a superior basis for compensating the manager, as noted in section II, and it 
communicates more information to prospective shareholders about the realized discount factor 
d, permitting them to choose their investments in the firm more accurately. This latter effect 
benefits the current shareholders as well, since it increases the expected selling price of the firm. 
It would seem, therefore, that discretionary GAAP is unambiguously better than rigid GAAP. 


21 This derivation is analogous to the derivation of (4) and (5) from pricing equations (1) and (2). 

22 See Claim 8 in the appendix for a proof. 

23 Note that even though rigid GAAP is defined in reference to the single accounting procedure d,, the specifics of the 
procedure do not enter into the definition of equilibrium: since both the firm's first period accounting earnings d,, 
[...] estimated gross earnings X, can be 
The Potential Superiority of Rigid GAAP 


Surprisingly, examples such as the one presented in exhibit 1 can be constructed for 
which rigid GAAP produces both strictly higher welfare to the current shareholders and a strictly 
higher expected selling price for the firm than discretionary GAAP. The critical feature of such 
counterintuitive examples is that the set of equilibrium (or implementable) action choices changes 
as the GAAP regime changes. In the example in exhibit 1, the unique equilibrium gross earnings- 
related action under rigid GAAP is $\bar{a}_x$ and the unique equilibrium gross earnings-related action 
under discretionary GAAP is $\underline{a}_x$. In this example, the benefits of the higher effort to increase gross 
earnings under rigid GAAP are so large that they overwhelm any of the previously mentioned 
benefits of discretionary GAAP. 

One explanation for such examples, tied closely to how the example in exhibit 1 was 
constructed, follows. Suppose that, under discretionary GAAP, the manager can be induced to 
adopt either of the action pairs $(\underline{a}_d, \bar{a}_x)$ or $(\bar{a}_d, \underline{a}_x)$ cheaply, and that it is prohibitively expensive 
to induce him to adopt $(\bar{a}_d, \bar{a}_x)$. (These assumptions also imply that, under rigid GAAP, it is cheap 
to get the manager to adopt the action pair $(\underline{a}_d, \bar{a}_x)$ whereas it is prohibitively expensive to get him 
to adopt either $(\bar{a}_d, \underline{a}_x)$ or $(\bar{a}_d, \bar{a}_x)$.) When GAAP entails complete discretion, the manager can 
report information about the results of his effort to reduce expenses, and the (current shareholders 
will know that the) prospective shareholders will respond to these reports. In the example in 
exhibit 1, this sensitivity of the selling price of the firm to reports about recognized expenses is 
so pronounced that, for all equilibria, the current shareholders instruct their manager to adopt the 
high effort level $\bar{a}_d$. In such cases, $(\bar{a}_d, \underline{a}_x)$ may be the only equilibrium action pair, even when 
the returns to $\bar{a}_x$ are high. While the current shareholders might like to be able to precommit 
themselves to induce the manager to adopt the action pair $(\underline{a}_d, \bar{a}_x)$, they are unable to make such 
precommitments. 

In contrast, when GAAP is rigid, any equilibrium necessarily involves the manager selecting 
the action $\underline{a}_d$, since the (prospective shareholders know that the) current shareholders cannot 
induce the effort $\bar{a}_d$. Inducing the action pair $(\underline{a}_d, \bar{a}_x)$ is posited to be inexpensive, so the 
equilibrium under rigid GAAP will typically involve the action pair $(\underline{a}_d, \bar{a}_x)$. Consequently, if the 
returns to $\bar{a}_x$ are high enough relative to the returns to $\bar{a}_d$, the rigid GAAP regime will be superior 
to the discretionary GAAP regime. 

Another way of explaining the intuition underlying these examples was discussed briefly in 
the Introduction. In any agency problem, as the range of possible actions available to the agent 


24 For the parameter values of Exhibit 1, there is no equilibrium under discretionary GAAP with the action pair $(\underline{a}_d, \bar{a}_x)$, 
since, if the prospective shareholders believe the current shareholders instruct their manager to adopt that action pair, 
then the current shareholders will in fact induce the manager to adopt the action pair $(\bar{a}_d, \underline{a}_x)$, as the following inequality 
holds: 


.0979 = E(d 18, ]- E[x 1a, ]+ô@ a )EGE(a,,a,)~C@,,a,)- 
(Eld 1a, ]- E(x 18, ]-- (a, 8, )EGEG, ,,) - C(a,,8,)) » 0. 
Yet, there is an equilibrium under discretionary GAAP with the action pair $(\bar{a}_d, \underline{a}_x)$, as the two inequalities: 


0016 = E[d à, ]- E[x 1a, ]-- (8,8, )SEGE(2,,8,) - C(8.,8,) - 
(Eid la] Elx 12, ]- &(a,, 8, )EGE(8,, 2,) - C(2,, 8,)) > 0 and 

.0069 = E[d | a, ]- E[x la,]+ 6(a,,4, JEGE(a, ,a,) ~C(a,,a,)- 
{E{dia,]-E{[xia,)]+8(a,,a, )SEGE(a,,8,) - C(a,,a,)] » 0, and hold. 


It can also be verified that the action pair $(\underline{a}_d, \bar{a}_x)$ is an equilibrium under rigid GAAP, and that this rigid GAAP equilibrium 
dominates any possible discretionary GAAP equilibrium. The details are available from the author. 
EXHIBIT 1 
Example for which Rigid GAAP Dominates Discretionary GAAP 


Technology producing current (first) period actual and estimated gross earnings 


xe{x, =0,x, «IExe ($,,$5):95,(8,) = Pr(x=Xq IK=K,a, sf s P(x =x] la, f = Pr(x =x, la,) 


q2 (8x3 qz (a, ) 2) (@,) do (8, ) Pr(X, lay) Pr(X; la.) f f 














16 18 62 635 .999 01 839 365 


Technology producing current (first) period expenses 


$d \in \{d_1 = .01, d_2 = .11\}$; $\Pr(d = d_2 \mid \bar{a}_d) = \bar{q} = .96$, and $\Pr(d = d_2 \mid \underline{a}_d) = \underline{q} = .89$; $p = .01$ 


Technology producing future (second) period earnings 


$\beta = 2$; $k = 500$ 


Manager’s preferences 
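$U(c) = c^{1/2}$; $\bar{U} = 6$; $g(\underline{a}_d, \underline{a}_x) = 0$; $g(\bar{a}_d, \underline{a}_x) = .1$; $g(\bar{a}_d, \bar{a}_x)$ "large" and 
$g(\underline{a}_d, \bar{a}_x)$ is chosen so that $C(\underline{a}_d, \bar{a}_x) = C(\bar{a}_d, \underline{a}_x)$* 

Equilibrium expected selling price of the firm and welfare of current shareholders 

under rigid GAAP with equilibrium action pair $(\underline{a}_d, \bar{a}_x)$: 
$E[d \mid \underline{a}_d]^{\beta} \cdot EGE(\bar{a}_x, \bar{a}_x) = .039$; $E[d \mid \underline{a}_d]^{\beta} \cdot EGE(\bar{a}_x, \bar{a}_x) - C(\underline{a}_d, \bar{a}_x) = .036$ 

under discretionary GAAP with equilibrium action pair $(\bar{a}_d, \underline{a}_x)$: 
$\delta(\bar{a}_d, \bar{a}_d) \cdot EGE(\underline{a}_x, \underline{a}_x) = .017$; $\delta(\bar{a}_d, \bar{a}_d) \cdot EGE(\underline{a}_x, \underline{a}_x) - C(\bar{a}_d, \underline{a}_x) = .014$ 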


* This is possible since: 


Cao) = (g(,,8,)/£ -D? x DE? - f0- £91; 
C(8,,8,)  (g(8,,2,)/(q —0))? x[q(q - L/( - p)? *£0- 1. 
increases, some aspects of the agency problem are exacerbated: if a principal sought to have the agent 
adopt some action e in some action set E, and the action set is subsequently expanded to $E' \supset E$, 
then it becomes more difficult and/or expensive to get the agent to select this same action e. 
Applying this observation to the external agency problem, it follows that converting from rigid 
GAAP to discretionary GAAP can exacerbate that agency problem, because the action set the 
current shareholders can get the manager to adopt under discretionary GAAP expands. The 
summary lesson of these examples is that there may be a potential conflict between information 
environments that reduce internal agency problems and information environments that reduce 
external agency problems.²⁵ 
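
The observation that a larger action set makes a given action weakly more expensive to implement can be seen in a textbook two-outcome agency problem. The sketch below is not the paper's model: it assumes a square-root utility (so the cost of delivering utility v is v squared), a binding participation constraint, and hypothetical action labels, success probabilities and effort costs. Each alternative action adds one incentive constraint on the spread between the two utility payments, and the cost of the contract rises with that spread.

```python
def min_cost_to_implement(target, actions, reservation=1.0):
    """Minimum expected cost of inducing `target`, or None if it cannot be induced.

    actions maps a label to (probability of the high outcome, effort disutility).
    Payments are in utility units with cost v ** 2 (square-root utility assumed).
    """
    p_e, g_e = actions[target]
    lo, hi = float("-inf"), float("inf")  # admissible range for the utility spread
    for label, (p_a, g_a) in actions.items():
        if label == target:
            continue
        if p_a < p_e:
            lo = max(lo, (g_e - g_a) / (p_e - p_a))
        elif p_a > p_e:
            hi = min(hi, (g_a - g_e) / (p_a - p_e))
        elif g_a < g_e:
            return None  # same outcome distribution at lower effort cost
    if lo > hi:
        return None
    spread = min(max(0.0, lo), hi)  # smallest admissible spread in absolute value
    # E[v ** 2] = (E[v]) ** 2 + Var(v), with E[v] pinned down by participation.
    return (reservation + g_e) ** 2 + p_e * (1 - p_e) * spread ** 2


small_set = {"e": (0.6, 0.20), "shirk": (0.4, 0.00)}
large_set = dict(small_set, other=(0.5, 0.05))  # hypothetical added action

cost_small = min_cost_to_implement("e", small_set)
cost_large = min_cost_to_implement("e", large_set)
print(round(cost_small, 4), round(cost_large, 4))
```

In this illustration the added action tightens the incentive constraint, so the cost of inducing the same action e rises from about 1.68 to about 1.98.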


Superiority of Discretionary GAAP When Gross Earnings Are Measured 
Without Error 


Even though counterintuitive examples such as the one illustrated in exhibit 1 can be 
constructed, we establish below several theorems that identify sufficient conditions under which 
discretionary GAAP is superior to rigid GAAP. So far in the analysis, we have distinguished 
among GAAP regimes only in terms of how expenses are reported, and have taken as exogenous 
how gross earnings are measured. Some GAAP regimes could provide a more accurate 
assessment of gross earnings than others. Our first condition distinguishes between GAAP 
regimes by whether the publicly reported estimated gross earnings X is a perfect or imperfect 
proxy for the firm's actual gross earnings x. 


Definition GAAP is said to measure gross earnings without error if X = x.²⁶ 


The first theorem below reveals that if GAAP measures gross earnings without error, then 
discretionary GAAP is superior to rigid GAAP without regard to other problem characteristics. 


Theorem 1 If GAAP measures gross earnings without error, then the expected welfare of 
the current shareholders is strictly higher under any discretionary GAAP equilibrium 
than under any rigid GAAP equilibrium. 


That is, if one aspect of the manager’s output is measured error-free, then increases in reporting 
discretion along other dimensions of his output are welfare-enhancing. The counterintuitive 
example in exhibit 1 illustrated an instance in which, when one dimension of the manager’s output 
is measured with error, then increasing the manager’s reporting discretion along other dimensions 
is undesirable. So, theorem 1 and these counterexamples demonstrate that one cannot deduce the 
desirability of changing GAAP along one dimension without considering at the same time how 
other dimensions of GAAP are formulated. 

The intuition for theorem 1 comes from two observations: first, as was noted in (10) above, 
if the actual and conjectured gross earnings-related actions $(a_x, \hat{a}_x)$ are the same under both rigid 
GAAP and discretionary GAAP, then the expected selling price of the firm is strictly higher under 
discretionary GAAP, because discretionary GAAP provides additional information about the 
firm's actual expensing experience. Second, when gross earnings are measured without error and 
accounting earnings are reported to be d × x, the selling price of the firm does not depend on 
prospective shareholders' conjectures about the manager's gross earnings-related action $\hat{a}_x$, because 
measured gross earnings reveals gross earnings accurately, i.e., $E[x \mid X, \hat{a}_x]$ is independent of $\hat{a}_x$. 


25 Other papers have demonstrated the related result that information useful for decision-making/valuation purposes may 
be different from information useful for stewardship purposes. Early examples of such a result include Gjesdal (1981) 
and Dye (1985). 

26 The condition X = x is stronger than what is actually required to prove theorem 1 below. All that is required for that 
result is the weaker condition that $E[x \mid X, \hat{a}_x]$ is independent of $\hat{a}_x$ for every X. 
Therefore, the firm's expected selling price also does not depend on $\hat{a}_x$. So, if discretionary GAAP 
prevails and current shareholders induce the manager to adopt the same actions they would have 
induced him to adopt under rigid GAAP, they cannot be worse off than they would have been 
under rigid GAAP: the expected selling price will be strictly higher under discretionary GAAP 
and both the expected first period earnings and the expected cost of getting the manager to adopt 
the actions will be the same.²⁷ A fortiori, if current shareholders decide to induce the manager to 
adopt a different set of actions under discretionary GAAP than under rigid GAAP, they must be 
strictly better off, by revealed preference.²⁸ 
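
The revealed preference step can be traced on a small example. The sketch below is illustrative only: the tables `EARNINGS`, `COST`, `PRICE_RIGID` and `PRICE_DISCRETIONARY` hold hypothetical numbers, chosen so that the price depends only on the actions actually taken (the error-free case) and is strictly higher under discretionary GAAP at any given action pair.

```python
# Hypothetical values per action pair (a_d, a_x); prices do not depend on
# conjectures, as when gross earnings are measured without error.
EARNINGS = {("low", "low"): 0.02, ("low", "high"): 0.06,
            ("high", "low"): 0.05, ("high", "high"): 0.09}
COST = {("low", "low"): 0.00, ("low", "high"): 0.01,
        ("high", "low"): 0.01, ("high", "high"): 0.03}
PRICE_RIGID = {("low", "low"): 0.10, ("low", "high"): 0.20}
PRICE_DISCRETIONARY = {("low", "low"): 0.12, ("low", "high"): 0.24,
                       ("high", "low"): 0.15, ("high", "high"): 0.30}


def welfare_rigid(pair):
    return EARNINGS[pair] + PRICE_RIGID[pair] - COST[pair]


def welfare_discretionary(pair):
    return EARNINGS[pair] + PRICE_DISCRETIONARY[pair] - COST[pair]


# Under rigid GAAP only the low expense-related action can be induced.
rigid_choice = max(PRICE_RIGID, key=welfare_rigid)
discretionary_choice = max(PRICE_DISCRETIONARY, key=welfare_discretionary)

# Step 1: copying the rigid choice is feasible and already strictly better.
copy_gain = welfare_discretionary(rigid_choice) - welfare_rigid(rigid_choice)
# Step 2: the optimal discretionary choice can only do at least as well.
switch_gain = (welfare_discretionary(discretionary_choice)
               - welfare_discretionary(rigid_choice))
print(rigid_choice, discretionary_choice, copy_gain > 0, switch_gain >= 0)
```

The two steps mirror the argument in the text: the first gain comes from the higher price at unchanged actions, the second from the freedom to pick a different action pair.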


Superiority of Discretionary GAAP When The Cost Function Displays 
Decreasing Differences 

In the exhibit 1 examples, assumptions were made to ensure that the incremental cost of 
getting the manager to adopt high effort along both action dimensions was enormous, whereas the 
cost of getting the manager to adopt a high effort level along only one dimension of effort was 
negligible. The converse possibility is considered next. 


Definition The function $C(a_d, a_x)$ exhibits decreasing differences provided $C(a_d, \bar{a}_x) - C(a_d, \underline{a}_x)$ 
is nonincreasing in $a_d$ (Milgrom and Roberts 1994). 


This definition requires that the incremental cost of getting the manager to exert the high effort 
level along one dimension is decreasing (technically, nonincreasing) in the other dimension of the 
manager's effort. A sufficient condition for $C(\cdot)$ to display decreasing differences is that the 
disutility of effort function $g(a_d, a_x)$ be of the form $g(a_d, a_x) = \tilde{g}(\min\{a_d, a_x\})$ for some increasing 
function $\tilde{g}(\cdot)$. This representation of the manager's disutility of effort may be appropriate, for 
example, for those tasks that merely require the manager’s presence for their completion: if the 
manager is present to perform one task, he may be present to perform other tasks at no additional 
personal cost too. 
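
On a finite action grid the decreasing differences condition can be checked mechanically. The sketch below is a generic check of the definition; the two cost tables are hypothetical and are not derived from the model. Effort levels are coded as integers, with 0 for low and 1 for high effort.

```python
def has_decreasing_differences(cost, d_levels, x_levels):
    """True if C(a_d, x_hi) - C(a_d, x_lo) is nonincreasing in a_d.

    cost maps (a_d, a_x) to a number; the level lists are sorted low to high.
    """
    for x_lo, x_hi in zip(x_levels, x_levels[1:]):
        increments = [cost[(a_d, x_hi)] - cost[(a_d, x_lo)] for a_d in d_levels]
        if any(later > earlier
               for earlier, later in zip(increments, increments[1:])):
            return False
    return True


LEVELS = [0, 1]
# Hypothetical table in which high effort on one dimension makes high effort
# on the other dimension cheaper to induce.
complementary = {(0, 0): 0, (0, 1): 4, (1, 0): 3, (1, 1): 5}
# Hypothetical table in the spirit of exhibit 1: either high effort alone is
# cheap to induce, both together are very expensive.
exhibit_like = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 100}

print(has_decreasing_differences(complementary, LEVELS, LEVELS))
print(has_decreasing_differences(exhibit_like, LEVELS, LEVELS))
```

The first table satisfies the condition and the second violates it, which is the contrast the text draws between the case considered next and the exhibit 1 construction.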

Theorem 2 presents another set of sufficient conditions to ensure the superiority of 
discretionary GAAP, based on the preceding definition. 


Theorem 2 If, in each GAAP regime, there exists a unique equilibrium action pair and, in 
addition, any one of the following conditions holds, 
(i) $C(a_d, a_x)$ exhibits nonincreasing differences when GAAP is discretionary; 
(ii) $a_x^d \ge a_x^r$; 
(iii) the manager's "x"-action set $A_x$ is a singleton, i.e., $\bar{a}_x = \underline{a}_x$, 
then the expected welfare of the current shareholders, as well as the expected selling 
price of the firm, is strictly higher under the discretionary GAAP equilibrium than under 
the rigid GAAP equilibrium. 


The intuition behind all parts of theorem 2 is implicit in the second condition of its statement: 
$a_x^d \ge a_x^r$. The exhibit 1 example suggests, and conditions (ii) and (iii) of theorem 2 confirm, that 
rigid GAAP's potential superiority over discretionary GAAP rests on the manager's equilibrium 
gross earnings-related action, $a_x^r$, under rigid GAAP being strictly greater than the corresponding 


27 Note that the preceding conclusion does not hold in general when GAAP measures gross earnings with error, because, 
in general, the expected selling price of the firm would depend not only on the gross earnings-related action choice 
$a_x$, but also on prospective shareholders' conjecture $\hat{a}_x$ about the gross earnings-related action the current generation got 
their manager to adopt. 

28 Also note that the revealed preference argument works here only because the selling price of the firm does not depend 
upon the prospective shareholders' conjectures about the manager's gross earnings-related action. It does not hold 
otherwise. 
equilibrium action, $a_x^d$, under discretionary GAAP. That condition (i) also ensures $a_x^d \ge a_x^r$ is a 
result of the equilibrium expense-related action under discretionary GAAP, $a_d^d$, being at least as 
large as the equilibrium expense-related action, $a_d^r = \underline{a}_d$, under rigid GAAP.²⁹ This fact, coupled 
with the decreasing differences assumption, ensures that the incremental cost of inducing the 
manager to adopt the high gross earnings-related action $\bar{a}_x$ under discretionary GAAP will be less 
than the incremental cost of inducing this action under rigid GAAP. Therefore, the equilibrium 
gross earnings-related action will be higher under discretionary GAAP. 


VI. PUBLICLY OBSERVABLE MANAGEMENT CONTRACTS 


When the contract between the manager and the current shareholders is observable to the 
prospective shareholders, the prospective shareholders can infer what actions and reporting 
policy the manager will adopt. Further, they can change their inferences as the contract changes, 
so questions about the existence of equilibrium reporting policies, contracts, or actions are moot 
in this case. 

When the contract is publicly observable, current shareholders behave as if they retained 
ownership of the firm in the second period, because they value any changes in the contracting and/ 
or investing environment in exactly the same way as prospective shareholders. We have already 
argued that switching to discretionary GAAP leads to an improvement in the contracting 
environment. Since discretionary GAAP allows investors to better tailor their investment 
decisions to the specifics of the firm's expensing history, it improves the investing environment 
as well. 


Theorem 3 If the manager's contract is observable to prospective shareholders, then the 
expected welfare of the current shareholders, as well as the expected selling price of the 
firm, is strictly higher under any discretionary GAAP equilibrium than under any rigid 
GAAP equilibrium.³⁰ 


VII. ENDOGENIZING THE PROBABILITY THAT AUDITING 
SYSTEM IS EFFECTIVE 


It is straightforward to show that the expected cost of compensating the manager is strictly 
decreasing in p, the probability that the auditing system is ineffective, whenever the manager is 
induced to select $a_d = \bar{a}_d$, and is independent of p if the manager is induced to select $a_d = \underline{a}_d$.³¹ 
If the probability p is publicly observable, it is also easy to show that, under discretionary GAAP, 
the expected proceeds from the sale of the firm are also strictly decreasing in p for any fixed action 
pair $(a_d, a_x)$.³² If, as one would expect, the direct cost of operating an auditing system is decreasing 
in the probability that it is ineffective, then the optimal choice of p trades off this cost against these 
benefits. 


29 The argument that follows is informal. A formal argument may be found in the appendix. 

30 The question remains as to which of the assumptions of observable or unobservable contracts is the more reasonable. 
While published data exist concerning the actual payment of managers, this is not evidence of a contract's observability, 
since what counts is the functional form of the contract, not the ex post payment. While some of the factors that influence 
senior management's compensation are discussed in proxy statements, many details of these compensation arrangements 
are not articulated, and boards of directors seem to have considerable latitude in deciding when bonuses should 
be given to senior executives and how large the bonuses should be. It seems reasonable to us, therefore, to argue that 
the assumption of the unobservability of the manager's contract to outsiders is the descriptively more realistic 
assumption. 

31 See the end of the appendix for a proof of this claim. 

32 The proof is available on request. 
When the probability p is chosen privately by the current shareholders (and the prospective 
shareholders have to infer this choice), the situation is quite different. This is easiest to see when 
the current shareholders are only concerned about selecting a value for p that maximizes the 
expected selling price of the firm net of the cost of compensating the manager (i.e., when the 
expected first period earnings are negligible). In that case, the current shareholders always choose 
p = 1 in equilibrium because, no matter what the prospective shareholders' conjectures about the 
dependence of the market-clearing price of the firm on the realized values of d and X (i.e., 
regardless of $P_{ij}$), the current shareholders can maximize the expected selling price of the firm 
at minimum cost by having an auditing system that never limits the manager's reporting 
behavior.³³ 

When the current shareholders take into account the impact of the probability p on the firm's 
first period earnings, an equilibrium choice of p < 1 may result, depending on how big an impact 
the expense-reducing action $\bar{a}_d$ has on current period earnings (note that p < 1 is required to induce 
$a_d > \underline{a}_d$). But, as the argument in the previous paragraph illustrates, the current shareholders' 
choice of p is too high relative to the social optimum, since a more effective auditing system 
creates a positive externality on the prospective shareholders. This effect may explain why the 
professional literature has recently proposed forcing firms to disclose information about the 
quality of the system of their internal controls (Journal of Accountancy, October 1993, 20). 

Finally, note that under rigid GAAP, the current shareholders would never pay to reduce the 
probability that the auditing system is ineffective, since the manager can only be induced to adopt 
the low effort "d"-action $\underline{a}_d$ under rigid GAAP and the expected proceeds from selling the firm 
do not vary with the auditing system's effectiveness. 


VIII. CONCLUSIONS, EXTENSIONS, AND CAVEATS 


We have considered the incentive effects, the investment effects, and more briefly, the 
distributional effects of expanding firms’ discretion in reporting accounting earnings. Expanding 
discretion makes it easier to contract with a manager, because it increases the number of 
dimensions along which his performance can be judged. Expanding discretion also improves the 
efficiency of investment. These two salutary effects of expanding discretion are insufficient to 
demonstrate its general superiority, however, because the actions of the manager may change as 
GAAP regimes change, and without imposing additional restrictions on the economic environment, 
the desirability of augmenting the amount of discretion in GAAP is not clear. 

While examples can be constructed that illustrate the potential superiority of rigid GAAP, we 
identify several sufficient conditions that ensure the superiority of expanding discretion. These 
include: the ability to measure gross earnings without error, having the cost of compensating the 
manager display decreasing differences in effort, restricting the dimensions of the manager’s 
actions to those that affect the variable over which increased discretion is being allowed, and 
having the contract given to the manager be observable to potential shareholders. From an 
accounting perspective, the most important of these is the one entailing measuring gross earnings 


33 More formally, if $\{P_{ij} \mid i \in N, j \in M\}$ are the proposed equilibrium market prices and $\ell(j) \in \arg\max_i P_{ij}$, then the first 
generation can induce the manager to adopt the reporting policy $\ell(\cdot)$ by setting p = 1 and offering a contract 
$v = \{v_{ij} \mid i \in N, j \in M\}$ that is independent of i. When p = 1, the manager can always report that $d_{\ell(j)}$ occurred, regardless 
of the actual realization of d. This choice of p = 1 and this form of contract are clearly cost-minimizing among all possible 
contracts, because they both contribute to encouraging the manager to take minimum effort along the "d"-dimension. 
Since this contract and choice of p maximize the firm's expected selling price and minimize the expected cost of 
compensating the manager, they constitute an equilibrium. 
without error, a variable over which the manager is presumed to have no reporting discretion. The 
finding here that one cannot ascertain the desirability of expanding discretion in one dimension 
of GAAP without considering how GAAP measures other dimensions of the firm’s performance 
may have important implications for piecemeal evaluation of changes in GAAP. From an 
economic perspective, the result that discretionary GAAP is superior as long as the contract 
current shareholders give to their managers is observable to prospective shareholders is also 
important. It suggests that clear communication of the details of management’s incentive 
compensation to prospective investors, coupled with improved accounting disclosures, can be 
strictly welfare-enhancing. 

Changing GAAP regimes can have many repercussions, not all of which are evaluated in our 
model. For example, information disclosure can have cash flow effects on firms other than the 
ones making disclosures. Such proprietary effects of disclosures may manifest themselves in the 
course of complying with financial reporting requirements, and these effects have not been 
evaluated here. Also, increases in the observable discretion in GAAP might change the amount 
of unobservable discretion in reporting choices: if one is operating in an “anything goes” 
environment, there are additional ways to alter what is reported besides merely choosing among 
GAAP in a self-interested fashion. Changing GAAP also can have redistributive consequences 
(e.g., Demski 1974 and Revsine 1991), and we have explored the political aspects of such 
consequences in only a cursory fashion here.³⁴ Finally, standard setters may restrict accounting 
choices so that the market develops a history of the relation between a firm's report using a 
particular procedure and subsequent economic events. Accumulating such a history on accounting 
procedures could be useful for both intertemporal and interfirm evaluations. From this 
perspective, standardizing GAAP results in a form of network externality, a topic of considerable 
recent interest to economists (see, e.g., Farrell and Saloner 1985 and Katz and Shapiro 1985). 


APPENDIX: PROOFS OF THEOREMS 
Proof of lemma 1 


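Define the following program. 
Program 2$(a_d, a_x)$ 

$$\aleph = \min_{v} \; (1-p) \sum_i \sum_j q_i(a_d) f_j(a_x)\, h(v_{ij}) + p \sum_j f_j(a_x)\, h(v_{\infty j})$$

subject to: 

$v_{\infty j} \ge v_{ij}$ for all $i \in \{1,\dots,n\}$ and all $j \in \{1,\dots,m\}$; 

(IC) $(1-p) \sum_i \sum_j q_i(a_d) f_j(a_x)\, v_{ij} + p \sum_j f_j(a_x)\, v_{\infty j} - g(a_d, a_x) \ge (1-p) \sum_i \sum_j q_i(a_d') f_j(a_x')\, v_{ij} + p \sum_j f_j(a_x')\, v_{\infty j} - g(a_d', a_x')$ for all $a_d', a_x'$; 

(IR) $(1-p) \sum_i \sum_j q_i(a_d) f_j(a_x)\, v_{ij} + p \sum_j f_j(a_x)\, v_{\infty j} - g(a_d, a_x) \ge \bar{U}$. 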


34 The redistributive effects of GAAP in our model are limited because the market for the firm's shares is competitive, so 
the (assumed) risk-neutral prospective shareholders purchase the firm for the expected value of its second period 
economic earnings, based on whatever information is available to them. All the benefits, and costs, of changing the 
amount of discretion in GAAP fall on those current shareholders who have property rights over the firm before GAAP 
is altered. 
This program is a relaxation of program 1$(\ell, a_d, a_x)$: rather than having the manager report $d_{\ell(j)}$ 
when the auditing system is ineffective, the manager is presumed to announce "$\infty$." That this is 
a relaxation is clear, since $v_{\infty j}$ could be defined to equal $v_{\ell(j) j}$. 

We prove lemma 1(a) for the case in which $a_d = \bar{a}_d$ and $a_x = \bar{a}_x$. (The proof for the case in 
which $a_d = \bar{a}_d$ and $a_x = \underline{a}_x$ is similar and omitted.) 

Let $v = \{v_{ij} \mid i \in N, j \in M\}$ be the unique solution to program 2$(\bar{a}_d, \bar{a}_x)$ (uniqueness follows 
since the program is convex). 

There are three action-related IC constraints in program 2$(\bar{a}_d, \bar{a}_x)$, one each for $(a_d', a_x') 
= (\underline{a}_d, \bar{a}_x), (\bar{a}_d, \underline{a}_x), (\underline{a}_d, \underline{a}_x)$.³⁵ Let the multipliers on the IC constraints for these three action pairs 
be respectively $\mu_1, \mu_2, \mu_3$. Also, let $\lambda$ be the multiplier on the IR constraint, and let $\lambda_{ij}$ be the 
multiplier on the constraint $v_{\infty j} \ge v_{ij}$. 

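The first order condition for $v_{ij}$, $i \ne \infty$, is (N.B. all of the multipliers in this first-order 
condition are nonnegative): 

$$h'(v_{ij}) = -\frac{\lambda_{ij}}{(1-p)\, q_i(\bar{a}_d) f_j(\bar{a}_x)} + \mu_1 \Bigl[1 - \frac{q_i(\underline{a}_d)}{q_i(\bar{a}_d)}\Bigr] + \mu_2 \Bigl[1 - \frac{f_j(\underline{a}_x)}{f_j(\bar{a}_x)}\Bigr] + \mu_3 \Bigl[1 - \frac{q_i(\underline{a}_d) f_j(\underline{a}_x)}{q_i(\bar{a}_d) f_j(\bar{a}_x)}\Bigr] + \lambda.$$

Claims 1-6 below pertain to the optimum of program 2$(\bar{a}_d, \bar{a}_x)$. 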


Claim 1 If Vi < V then T < Vy for alli Si. 


proof Suppose to the contrary that v> v, for some i^« i. Then, the constraint v_, 2 vis not 
binding and the multiplier” y 15 zero. Then, the first-order conditions for v and v,, reveal that 


(hb (v3) 2 pil- gy (@,)/ gy E+ BTE f; (8,)/ fj (8, ] 

+ ull- q,(a,)f;(a,)/ qz (a, )f,(a,)]9 Azk (vj) 
(the first inequality follows A, — 0 and from MLRP of the family {q;(a,)!a, € {a,,4,}} and the 
second follows from A, e 0). Since hÇ) is convex, it follows that Vi 2 vy 8 contradiction. 
Claim 2 If Vu Sv wf then MT is weakly increasing in i” fori” S 1. 
Claim 3 If v, = Vy» then v, =V for all i 2 i. 
(The proofs of claims 2 and 3 are similar to that for claim 1 and are omitted.) These claims establish 
that M is (weakly) increasing in i for each j. 
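Claim 4 At least one of the multipliers $\mu_1$ or $\mu_3$ must be positive. 


proof If both of these multipliers are zero, then neither of the corresponding constraints is 
binding. The nonbinding constraints of a convex program can be deleted without affecting its 
optimum. That is, $v = \{v_{ij} \mid i \in N, j \in M\}$ also solves the program which replaces the original IC 
constraints that appear in program 2$(\bar{a}_d, \bar{a}_x)$ with the single IC constraint: 

$$(1-p) \sum_i \sum_j q_i(\bar{a}_d) f_j(\bar{a}_x)\, v_{ij} + p \sum_j f_j(\bar{a}_x)\, v_{\infty j} - g(\bar{a}_d, \bar{a}_x) \ge (1-p) \sum_i \sum_j q_i(\bar{a}_d) f_j(\underline{a}_x)\, v_{ij} + p \sum_j f_j(\underline{a}_x)\, v_{\infty j} - g(\bar{a}_d, \underline{a}_x).$$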


35 None of these will be redundant in general, since we have not specified $g(a_d, a_x)$ in detail. 


410 The Accounting Review, July 1995 


Note that this last IC constraint corresponds to the situation in which the action $a_d$ is no longer 
considered to be a decision variable for the manager, since the action $\bar{a}_d$ appears on both sides 
of the IC constraint. So the optimum of the program with this substituted IC constraint has the 
property that the manager's compensation does not vary with i, since it is both unnecessary and 
undesirable to impose risk on the manager to get him to take the action $\bar{a}_x$. Recall that the optimum 
of this substituted program is known to be the optimum $v = \{v_{ij} \mid i \in N, j \in M\}$ of the original 
program 2$(\bar{a}_d, \bar{a}_x)$. But we know that, if $v_{ij}$ does not vary with i, the manager will not be induced 
to adopt the effort $\bar{a}_d$ in that original program. This contradiction proves that at least one of the 
multipliers $\mu_1$ or $\mu_3$ must be nonzero, as claimed. 


Claim 5 For each j, at least one of the constraints v, , 2 v, binds. 
proof Suppose none binds, so Ay = 0 for alli. Then the first order condition for v , is: 


bv.) = Gy +D- G) fE). 


By claim 4, at least one of the multipliers 4., pis positive. Couple this fact with MLRP, which 
implies that q,(a,)/q,,(@,) < 1,* to conclude: 


h'v y) = [1 q,(8,)/ q, &,)] Hall -£,@,)/£,@ 1+ MEE — 9, (af, (8,)/ , (,)£,(8,)] 4 » 
Ba D - 1] po. f; (a, )/ f, (8,)]- uLE 7 1: f, (3,)/ f; (8, )] - À = 
(H2 +w -f @,)/ f @, +A =h (vaj) 


This implies v,, > v_,, a contradiction. 


Claim 6 v_,= v,, for each j, and v, is strictly increasing in i fori <n. 

proof As v, is known to be weakly increasing in i (from claims 2 and 3), and there is some i for 
which Va Ya (by claim 5), the first result follows. The second result follows because the first 
inequality in (*) in claim 1 is strict when at least one of the multipliers p,, u is positive (which 
we know to be true by claim 4). i 


"Claim 7 If 7() =n, then Min (48, à,) - N  5(,8,,,) and for every £+ Z,E(,8,,8,)» R. 
e 


proof By claim 6, v,» V.» so program 2(8,,,) is identical to program 1 (4,8,,8,). Thus, 
Min EC, &,8,)€5(/,a,,3,) 2. But, since program 2(8,,8,)is a relaxation of program 
1(4,a,,a,) for any £, we know N s Min &(4, a,.a,). This proves the first half of claim 7. 

€ 


To prove the second half of claim 7, let v(f)= (vy (lieN,jeM)bethe solution to program 
1(¢,a,,8,) for any £# £. Recall that v ={v l ie N, j eM] uniquely solves program 2(8.,,). 
Thus, since v( £) is feasible for program 2(8,,8,) (set v,, = Vig} if vt £) and v, do not 
coincide for every i € (1,...,n] and every j e (1,...,m), then 5&(4,8a,,3,) > X= G(Z,a,, a. ). 


"Proof: suppose q,(8,)/q,(a,) S1. Since q;(8,)/q,(a,) is strictly increasing in i, we have q,(@,)/q,(a,)< 1 for 
all i, and hence 1 — $3, (8,) «9,q,(8,) 71, a contradiction. 
i 
Pick j with /(j) < n = &j). If v s (/) + Vyp we are done. Alternatively, if Vapi) = V qj, 
then 
vy D S Veep O = Vy < Vay 


(the weak inequality follows since v( £) is optimal, and hence feasible, for program 1(4,a,,a, ) and 
the strict inequality is from claim 6), so Vij (s v, and we are again done. 


These seven claims collectively prove lemma 1(a). 


proof of lemma 1(b) 
Letv={v,| €N, j eM) solve program 2(a,,a,). Define V; = Voy and, fori e, Vy s È Vijay (a.) 


(the sum is taken across i’ e N). Notice ¥y S ¥_, for alli andj and ¥= (9, li e N,j e M) satisfies 
both the IR and IC constraints for program 2(a,,8,). Since program 2(8,,8,) is convex, it 
follows that ¥ must be of strictly lower cost than v unless V; does not depend on i. 

Similarly, by constructing V ;20- pv, t pv, , we can reduce the expected cost of compen- 
sating the agent unless V, =V_, for all i and j. 

This shows that the optimum v &(v,!ie N,j e M) of program 2(a,,a,) does not depend 
on i. It then follows that any of the programs 1(4,8,,8,) can replicate the cost of program 
2(8,,8,) via a contract that does not depend on i either, which proves the lemma. 


proof of lemma 2 


For any reporting policy /?(- and i = £°(j), whether P, is bigger than P for i” + i is determined 
by whether m(a „i. j) is bigger than d, But, as m(a.,i,j) is a weighted average of E[d | a ] and d, 


m(a ij) « d. 


Thus, unless ¢°(j)sn, the prospective shareholders’ conjectures about the firm's reporting 
policy will not be fulfilled. This proves part (a). 

To prove part (b), notice that if d. , > m(a ,n,j), the manager would be better off reporting 
d- d, , than d- d, when the auditing system is ineffective. And if d,, S m(a nj), then 
obviously, Max(d, i +n} S m(a_,,n,j), so the manager will have no incentive not to report d = d, 
when the auditing system is ineffective. This proves part (b). 


Claim 8 If an equilibrium exists under discretionary GAAP, then: 
8G,,a,) > 6(8,,8,) > d@,,a,) > Eld la, f. 


proof Recall m(a,,n,j)= [pE[d | a,] + (1-p)q (a )d Vip + (1-p)q,(a,)]. The last inequality in the 
statement of the claim then follows directly from Jensen’s inequality: 


5(a,.2,) = (=p) Zq Gd} +[p+(1—p)q,(a,)im(a,,n, P 2 
(0-p) Ea, @.)d; t [p (1- p)a, (a, )Im(a,, n, ))? = Eld la, f, 


with the inequality being strict unless p = 1. 
To prove the first inequality, note that since an equilibrium is assumed to exist under 
discretionary GAAP, we know—by lemma 2—that m(a ,n,j) 2 d, ,. Hence, the sequence d,, d.,,..., 
d, ,, m(a,,njj)is an increasing sequence. By MLRP, it follows that 0(a5,a, )is increasing in at. By 
MLRP again, we know that q, (@,) » q,(a,) and E[d! a, ]> B[di a,]. Sinced, » E[d | a ] for all a, 
it follows that m(a,,n,j) is strictly increasing in a,. Hence, 5(a;,a,) is increasing in a, too. This 
proves the claim. 


proof of Theorem 1 


Since GAAP measures gross earnings without error, X. j= = E[x!x pas ] is independent of a. 
Therefore, EGE (a*, a.) depends only on the action ar the current shareholders induce their 
manager to adopt, and not on the action a, the prospective shareholders conjectured the manager 
adopted. To acknowledge this, we write EGE( a”) in place of EGE (aa, ). 

Consequently, the definition of an equilibrium action pair under discretionary GAAP when 
gross earnings are measured without error simplifies to: any pair (a4,a¢) satisfying 


(ad,ad)eargmax E[dla?]: E[x |a; ] - &(a;, ad)EGE(a; ) — C(a;, a7). 
ala; 


And the definition of an equilibrium action pair under rigid GAAP simplifies to: any pair (a,,a‘ ) 
satisfying a! earg max  E[d!a, ]- E[x la; ] - Ed 1a, P : EGE(a7) — C(a,, 7 ). 


ay 


Now, from claim 8, Efdla a, Jf « (aca, )holds for all possible actions a, € A. So, if a7 is 
an equilibrium action under rigid GAAP, and (a ,ad) is an equilibrium ‘action pair under 
discretionary GAAP, then 


E[d la, ]: E[x la: ]+ B[d |a, P EGE(ar) - C(a,, ar) < 
E[dl2,]- E[xlaz ]-- 0(a,,ad) EGE(ar ) - C(a,, ar) S 


max Efdla*]-E[x|!a%]+6(a*,ad)- EGE(a* )— C(a*,a®) = 
ai. al 
E[d lag]: E[x 1a4]-- (ad, a2) EGE(ad) — Cad, ad) 


(the last inequality follows by definition of an equilibrium). The first and last expressions in this 
string of inequalities are, respectively, the expected welfare of the current shareholders under 
rigid and discretionary GAAP. So, these inequalities demonstrate the strict superiority of 
discretionary GAAP. 

proof of Theorem 2(ii) 


Suppose af 2a! . Then, since (a2,a4) is an equilibrium under discretionary GAAP, the current 
shareholders must be better off having their manager adopt action pair (a¢,a‘) than action pair 
(a5, a? ) (a,, aT), and therefore it follows that: 


E[d! ad]- E[x 1a2] - 8(a3, a2) EGE(a2, a3) — C(ad, a1) 2 
E[d a, ]- [x a3] - (a, a$)EGE(a; a2) — C(a,, a; ). un 


Since EGE(a‘,-)is increasing, af 2a! implies EGE(a^,31)2 EGE(a‘,a'). From claim 8, 
6(2,,a3) 2 5(a.,a,) > Efdla,]?. Combining these last two inequalities, we conclude 
d(a,, a?)EGE(a‘ a?) zi C(a., ar) > E[d | a, PEGE(a! , a7) ~ C(a., ay). 
Now add E[dl a. ]- E[x | a? ] to both sides of this last inequality to get 


E[d 1a,]  E[x lat ] - 6(a,, ad)EGE(az, a2) - C(a,, ar) > 
E[d |a,]- E[x lar] - E[d a, P -EGE(at,at)—C(a,,a!). (12) 


By combining (11) and (12), we conclude: 


E[d | a4]. E[x | a2] - 6(a2, a1) EGE(a3, at) — C(ad, ad) > 
E[d |a, ]- E[x la? ]+ E[d! a, [ PEGE(a? a? ) - C(a,, aT), 


which proves that the first generation is strictly better off under discretionary GAAP. Moreover, 
since EGE(,-) is increasing in both of its arguments, and 6(a‘,a‘) > E[d | a, P forall af, it follows 
that the expected selling price of the firm is strictly higher under discretionary GAAP as well, 
i.e., 6(a2, ad )EGE(a2, a?) > E[d | a, PEGE(a* , a! ). 


proof of Theorem 2(i) 
If a’ =a,, then a? 2a’, and the result follows from Theorem 2(ii). So, we shall suppose for the 
remainder of the proof that a? =a, is the equilibrium under rigid GAAP. Since this equilibrium 
is, by assumption, unique, a! = a, is not an equilibrium under rigid GAAP. This latter fact implies 
that: 


E{dla,J-E{x!a,]+E[dla,]’ -EGEG,,a,)—C(a,.a,) > 
E[d!a,]- E[x | a,]-- Ed |a, ]P -EGE(a,,a,)—C(a,,a,), 
or equivalently, 
E[d la, P(EGE(&, ,a,)—EGE(a,,a,)}> E[d a, ]- (ix la, ]- E[x 12, ]) ^ 
C(a,., 8, ) - C(8,,a, ). (13) 
Then, 5(a,,a,) > E[dl a P ensures: 
a. a.) EGE(8,, a, ) - EGE(a;,a,)} 2 E[d a, 1: (BIx | a, ]— HIx | a.])- 
C(a,,a,)-C@,,a,), 
or equivalently 
E[d!a,]- Ex a, ] - 6(a,, a, EGE, ,a,)~-C(a,,a,)> 
E[d a, ]- E[x |a, ]- 9(a,, a,)EGE(2,,a,) — C(a,,a,) (14) 
and hence (a,,a, ) is not an equilibrium under discretionary GAAP. 
Likewise, 8(a,,8.) » Edla, P and inequality (13) imply: 
ôG., a )(EGE(a,,2,) - EGE(a,,a,)}> Eid 1a,] {Elx 12, ]~E[x!4,]}+C(a,,a,)-C(a,,a,)2 
E[d ta, ]- (Ex a, ] - E[x! a, ]) +C, 8, ) - C(2,,2,) 


(the last inequality follows from C(.,) possessing decreasing differences and from 
E[x la ] 2 E[x! a, ] and E[d1 a, ] 2 Eld La, ]). Upon rearrangement, this last inequality yields: 


E[d1a,]- E[x |à, ]-- (8, a, )EGE(8,,2,)— C(8,,8,) > 
E[d à, ]- E[x 12, ] - 8(8,, a, EGE(a,,a,)-C@,,a,), 


i.e.,(8,,8,) is not an equilibrium under discretionary GAAP. 

Thus, since there exists, by assumption, a (unique) equilibrium under discretionary GAAP, and 
since we have shown that neither (a,,a, ) nor (8.,8,) constitutes an equilibrium, we conclude 
that the equilibrium “x”-action under discretionary GAAP, a‘, must be a? =a, 2a'. Theorem 
2(i) then follows from Theorem 2(ii). 


proof of Theorem 2(iii) 
This is trivial by Theorem 2(ii): A, being a singleton necessarily implies that af 2 aT. 


Proof that the expected cost of compensating the manager is strictly decreasing in p 
Let "AS = p" be shorthand for "the auditing system is ineffective with probability p." Let 
v= {v,lie N, j e M} be the manager's optimal contract when AS = p^, and suppose p < p^. 
Construct a new random variable, say r, independent of all other random variables in the model, 
such that r = 1 with probability (1—p^Y(1—p) and r = 0 with probability (p —pY(1-p). Create the 
following r randomized contract when AS = p: if the auditing system turns out to be effective, d 
=d, and X= Xj, then pay the manager v, ifr=1 and pay him v, if r= 0. If the auditing system 
is ineffective, d = d,, and X= Xj, then pay the manager v, . Observe that the manager's expected 
utility from adopting any action pair (a,,a,) when given this contract and AS = p is: 


- pt - p»/- p q,(a,)f,(,)vi + (p' - pd - pL f (a 0v, + pÈ f,(a,)v, 7 8a) = 
ü- pZ a; (a.)f, (a.v + 2 f(a, Yaj — g(a,;8,). 


This is the same as the manager’s expected utility under the original contract when AS = p°. It 
follows that: the manager's IR constraint is satisfied with this contract and AS = p; whatever 
actions the manager was induced to take with the original contract when AS = p^ will also be 
induced with this contract and AS = p; and the current shareholders’ expected cost of compen- 
sating the manager with this randomized contract when AS = p is the same as with the original 
contract v when AS = p^. These observations demonstrate that the randomized contract is feasible 
for program 1 (7, a,,8,) when AS = p( and /() = n). But the randomized contract is clearly not 
the solution to that program, since that program, being convex, does not have a randomized 
solution. This proves that the expected cost of compensating the manager when a. — à is strictly 
decreasing in p, as claimed. 

When a, = 8, is being induced, the optimal contract v = {v} li € N, j €M} to give to the 
manager is independent of i, so the probability that the auditing system is ineffective is irrelevant 
in that case. 
I. INTRODUCTION 


For over two decades, empirical research has documented similarities between price and 
volume reactions to earnings announcements. Beaver (1968) found that earnings 
announcements generate both abnormal price changes and abnormally high trading. 
Subsequent research has shown that the magnitudes of both price and volume reactions increase 
with unexpected earnings, and decrease with firm size. The focus on similarities between price 
and volume reactions has naturally led researchers to view price and volume as substitute 
measures of “market reaction.” Significant conceptual differences exist, however, between price 
and volume reactions to informative disclosures. Recent research has formalized Beaver’s (1968) 
intuition that price changes reflect the change in the aggregate market’s average beliefs, while 
trading volume is the sum of all individual investors’ trades (Kim and Verrecchia 1991a). Trading 
volume preserves differences among individual investors that are “cancelled out” in the averaging 
process that determines equilibrium prices (Kim and Verrecchia 1991a). Given these differences, 
it is possible that some earnings announcements will generate heavy trading but small price 
changes, and vice versa.¹ 

Surprisingly, there is little empirical evidence concerning differences between the magni- 
tudes of price and trading volume reactions to accounting disclosures, even though research has 
acknowledged the possibility of differential price and volume reactions (see, for example, Beaver 
1968; Lev and Ohlson 1982; Dontoh and Ronen 1993). We address this issue by providing 
evidence on two questions. The first question is: How often do earnings announcements generate 
large price changes but little trading, or vice versa? After documenting the existence of 
differential reactions, we provide initial evidence on the second question: Are these differential 
reactions related to announcement-specific characteristics? 

Our empirical investigation is based on a sample of price and volume reactions to 8,180 
quarterly earnings announcements by 1,079 firms, during the period 1986—1989. Consistent with 
prior research, we find that the magnitudes of price and volume reactions are positively related. 
However, our evidence suggests that this relation is weaker than expected. Nearly a quarter of the 
earnings announcements generate either (1) very high trading but little price change, or (2) large 
price change but little trading. Evidence of such substantial differences between price and volume 
reactions suggests that trading volume-based research has the potential to yield insights beyond 
those attainable through price-based research. Further empirical analysis suggests that earnings 
announcements that generate high trading volume reaction (relative to price reaction) are 
associated with (1) divergent financial analysts' (predisclosure) earnings forecasts; (2) large 
analyst followings; (3) large magnitude of random-walk-based unexpected earnings relative to 
analyst-based unexpected earnings; and (4) stock price increases. The study's empirical results 
are broadly consistent with the general intuition underlying much theoretical trading volume 
research: when an announcement generates differential belief revision, trading volume is likely 
to be high.² 


1 If, for example, an earnings announcement conveys "good news" relative to some investors’ prior expectations, but "bad 
news" relative to other investors’ expectations, the former are likely to buy from the latter, thereby increasing trading 
volume. The associated price change may not be significant, however, if investors' belief revisions are largely 
counterbalancing (since prices reflect an averaging of individual investors' beliefs). Conversely, even if an announcement 
causes a change in average beliefs which in turn induces a price change, trading volume may be low if investors 
have identical predisclosure expectations and interpretations of the announcement. 

2 See Karpoff (1986) or Kim and Verrecchia (1991a; 1991b). In particular, Kim and Verrecchia (1991a; 1991b) model 
trading volume reaction to an informative disclosure as an increasing function of (1) differential belief revision among 
individual investors (arising from predisclosure information asymmetry, in their model), in addition to (2) the magnitude 
of the associated (absolute) price reaction. 
These results have implications for researchers and policymakers who consider market 
reactions as evidence of a disclosure’s usefulness to investors. Our empirical evidence that 
accounting announcements often generate either (1) high trading volume but little price 
movement, or (2) large price changes but little trading suggests that it is important for policymakers 
and researchers to consider both price and volume reactions to accounting disclosures in order to 
avoid drawing unwarranted conclusions. For example, it would be premature to conclude that a 
disclosure is not used by individual investors based on evidence of a small or negligible price 
reaction only, because that disclosure could nevertheless have stimulated considerable trading 
among investors. 

The remainder of the paper is organized as follows. Section II provides a conceptual 
development in the context of prior research. Sample selection, data sources, and operational 
variable definitions are discussed in section III, while section IV presents statistical analyses and 
results. A brief summary and concluding comments appear in section V. 


II. CONCEPTUAL DEVELOPMENT 


Price changes reflect changes in the aggregate market's average beliefs, while in contrast, 
trading volume is the sum of all individual investors' trades, or actions (Beaver 1968; Bamber 
1987; Kim and Verrecchia 1991a). Because price changes and trading volume capture 
fundamentally different aspects of the market's assimilation of information, earnings announcements 
generating heavy trading will not necessarily also induce large price changes. Thus, we begin by 
assessing the frequency with which earnings announcements generate heavy trading, but little 
price reaction, or vice versa. 
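
The distinction between averaging and summing can be made concrete with a toy calculation. The sketch below is only an illustration under strong simplifying assumptions that are not taken from any of the cited models: the price change is set equal to the average belief revision, and each investor trades in proportion to the gap between his or her own revision and that average. The function name `toy_reactions` is hypothetical.

```python
from statistics import mean

def toy_reactions(belief_revisions):
    """Toy price and volume reactions from individual belief revisions.

    Assumption (not from the paper): the price change equals the average
    revision, and each investor trades in proportion to the gap between
    his or her own revision and that average.
    """
    price_change = mean(belief_revisions)
    # Every trade has a buyer and a seller, so halve the summed absolute gaps.
    volume = sum(abs(r - price_change) for r in belief_revisions) / 2
    return price_change, volume

# Offsetting revisions: the average does not move, but investors trade.
offsetting = toy_reactions([2.0, -2.0, 1.0, -1.0])

# Identical revisions: the average moves, but nobody has a reason to trade.
identical = toy_reactions([1.5, 1.5, 1.5, 1.5])
```

In the first case the revisions cancel in the average but not in the sum; in the second the reverse holds. These are the two kinds of differential reaction whose frequency is assessed.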

After establishing the existence of such differential reactions, the next question is whether 
these differential reactions are systematically related to announcement-specific characteristics. 
Ideally, we would employ a single, general theoretical model as a basis for identifying 
characteristics that are likely to be associated with differential price-volume reactions. Unfortu- 
nately, such a model does not exist currently. Extant theoretical research employs various models 
that differ significantly in their underlying assumptions.³ Because each of the existing models is 
incomplete, there is no guarantee that the predictions of any particular model would continue to 
hold up in some as yet undiscovered more general (i.e., complete) model. Nevertheless, we 
believe that it is appropriate to appeal to existing theories to suggest empirical tests. 

Price reaction to a public disclosure reflects the average belief revision in the aggregate 
market resulting from the disclosure (Beaver 1968; Kim and Verrecchia 1991a, 1991b). In 
contrast, trading volume arises when individual investors make differential belief revisions 
(Karpoff 1986; Kim and Verrecchia 1991a, 1991b). Thus, we expect to observe heavy trading 
relative to price reaction when an earnings announcement generates (1) differential belief 
revisions among individual investors, but (2) a small average aggregate market belief revision 
(because these differential individual belief revisions are largely offsetting), and conversely. 
Drawing upon prior theoretical and empirical research, we identify characteristics that are 
expected to be associated with differential price-volume reactions because of their relation to the 


3 For example, some models allow differential priors, but require homogeneous interpretations (e.g., Kim and Verrecchia 
1991a; 1991b), other models allow heterogeneous interpretations, but require homogeneous priors (e.g., Kim and 
Verrecchia 1994), and private information acquisition is exogenous in most models, but is endogenous in Demski and 
Feltham (1994) and Kim and Verrecchia (1991b, 1994). The extant models examine the effects of relaxing different 
assumptions, so they yield different insights into how announcement-specific characteristics might be associated with 
differential price-volume reactions. 
average aggregate market belief revision, and/or differences among individual investors’ belief 
revisions. The rationale for the directional expectations regarding each characteristic is discussed 
in turn. 

Prior analytical research suggests that trading volume is an increasing function of the degree 
of divergent predisclosure expectations (e.g., Varian 1986; Karpoff 1986), and supporting 
empirical evidence appears in Ajinkya et al. (1991) and Atiase and Bamber (1994). Furthermore, 
Atiase and Bamber (1994) and Kross et al. (1994) present empirical evidence that trading volume 
reaction to earnings announcements is an increasing function of the divergence in analysts' 
forecasts, even after controlling for the magnitude of the price reaction. This result suggests that 
even if an earnings announcement generates a small price reaction, it may nevertheless stimulate 
considerable trading if there is wide dispersion in analysts' forecasts. Earnings announcements 
that generate heavy trading relative to price reaction may therefore be associated with divergent 
analysts' forecasts. In a related vein, Kim and Verrecchia (1991b) suggest that the magnitude of 
trading volume relative to price reaction is an increasing function of predisclosure information 
asymmetry. Atiase and Bamber (1994) argue that divergence in analysts' forecasts reflects the 
unobservable predisclosure information asymmetry construct, which is consistent with our 
expectation that the magnitude of trading relative to price reaction is an increasing function of the 
divergence in analysts' forecasts. 

We also expect differential price and volume reactions to be associated with firm size and 
analyst following. Atiase (1980) argues that firm size is positively related to investors' incentives 
to acquire private predisclosure information, and similarly, Dempsey (1989) concludes that the 
number of analysts forecasting a particular earnings announcement is an increasing function of 
net benefits to the production of private information. These studies suggest that firm size and 
analyst following reflect general incentives for investors, analysts, and other market participants 
to acquire private predisclosure information. We expect this aspect of the predisclosure informa- 
tion environment to have a differential effect on price versus trading volume reactions to earnings 
announcements, because private predisclosure information has two effects on the market's 
reaction to announcements. First, private predisclosure information reduces the "surprise" 
conveyed to the market by the earnings announcement (e.g., Atiase 1985), which in turn 
decreases the magnitude of both price and volume reactions. Second, provided that individual 
investors differ in the private predisclosure information to which they are privy, investors will 
likely form heterogeneous predisclosure expectations and/or heterogeneous interpretations of the 
earnings announcement. Heterogeneity in either expectations or interpretations can stimulate 
trading (Karpoff 1986). Thus, differential private predisclosure information is likely to have two 
counteracting effects with respect to trading volume: (1) reduction of surprise, which tends to 
reduce trading volume, and (2) heterogeneous belief revisions pursuant to the earnings announce- 
ment, which tend to increase trading volume. No such counteracting effect exists for price 
reaction, so we expect earnings announcements that generate large trading volume reactions, but 
small price reactions, to be associated with larger firms or larger analyst following. 

Prior research suggests that price and volume reactions are likely to exhibit differential 
sensitivity to analyst forecast-based vis-a-vis random-walk-based unexpected earnings. When 
earnings announcements from all four fiscal quarters are pooled together in a single analysis, 
Brown et al. (1987) and Wiedman (1990) conclude that abnormal returns are more closely 
associated with analyst-based unexpected earnings than with unexpected earnings based on a 
seasonal random-walk model. Similarly, in their analysis of annual earnings announcements, 
Hopwood and McKeown (1990) find that abnormal returns are more closely associated with 
analyst-based unexpected earnings than with simple random-walk-based unexpected earnings. In 
contrast with these studies, Hughes and Ricks (1987) investigate fourth quarter earnings 
announcements that are immediately preceded by a Standard and Poor's analyst forecast in the 
week before the earnings announcement. Hughes and Ricks (1987) conclude that abnormal 
returns around these earnings announcement dates are slightly more closely associated with 
seasonal random-walk-based unexpected earnings than with analyst-based unexpected earnings. 
Although these studies’ conclusions are not in complete agreement, the preponderance of 
the evidence suggests that price reactions are likely to be more closely associated with analyst-based 
unexpected earnings than with (seasonal) random-walk-based unexpected earnings.⁴ This 
is not surprising since analysts forecast earnings more accurately than do random-walk models 
(e.g., Bamber 1987; Hopwood and McKeown 1990). On the other hand, Bamber (1986) suggests 
that trading volume is more closely associated with random-walk-based unexpected earnings 
than with analyst-based unexpected earnings. While significant differences among these studies' 
methods preclude direct comparisons,⁵ this prior empirical research suggests that trading volume 
tends to reflect random-walk-based unexpected earnings, while price reactions tend to reflect 
analysts-based unexpected earnings. Consequently, we expect that the higher (absolute) random- 
walk-based unexpected earnings are relative to (absolute) analysts-based unexpected earnings, 
the higher trading volume is likely to be (following the larger random-walk unexpected earnings) 
relative to price reaction (following the smaller analyst-based unexpected earnings), and 
conversely. 
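
The comparison of the two earnings-surprise measures can be sketched as follows. This is a hedged illustration only: it assumes a seasonal random-walk expectation equal to earnings for the same quarter of the prior year, and it applies no price or other scaling, neither of which is confirmed by the text above. All function names and the sample figures are hypothetical.

```python
def random_walk_surprise(eps, eps_same_quarter_last_year):
    """Unexpected earnings under a seasonal random-walk expectation (assumed form)."""
    return eps - eps_same_quarter_last_year

def analyst_surprise(eps, analyst_forecast):
    """Unexpected earnings relative to the analyst forecast."""
    return eps - analyst_forecast

def relative_surprise(eps, eps_same_quarter_last_year, analyst_forecast):
    """|random-walk surprise| divided by |analyst surprise|.

    A ratio above 1 is the case in which trading volume is expected to be
    high relative to the price reaction. Undefined when the analyst
    surprise is exactly zero.
    """
    rw = abs(random_walk_surprise(eps, eps_same_quarter_last_year))
    af = abs(analyst_surprise(eps, analyst_forecast))
    return rw / af

# Hypothetical quarter: EPS 1.25, same quarter last year 0.75, forecast 1.00.
ratio = relative_surprise(1.25, 0.75, 1.0)
```

In this hypothetical quarter the random-walk surprise is twice the analyst-based surprise, so the expectation stated above would point toward heavy trading relative to the price reaction.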

Recent research in accounting and finance suggests that this empirically motivated expectation 
is not necessarily anomalous. Empirical evidence is consistent with the market forming 
expectations that at least partially reflect a random-walk expectation (Bernard 1991), suggesting 
that some individual investors form random-walk expectations. Clearly, random-walk-based 
unexpected earnings is an outdated measure of earnings surprise, relative to analyst forecast- 
based unexpected earnings, and Black (1986) suggests that investors who trade on pseudo-signals 
such as outdated “information” constitute one class of noise traders. Research on so-called “noise 
traders" is a currently developing paradigm, but at this point the general conclusion appears to be 
that either (1) noise traders do not affect prices (e.g., Figlewski 1979), or (2) noise traders could 
under some circumstances affect prices, but any effects of these noise traders (on prices) are, at 
least to some extent, mitigated by the actions of better informed traders who drive prices toward 
fundamentals (e.g., Black 1986; De Long et al. 1990a).⁷,⁸ Thus, any pressure on prices of investors 
acting on (outdated) random-walk measures of earnings surprise is likely to be largely “cancelled 
out" by better informed traders (e.g., those who act on analyst forecast-based earnings surprise). 
In contrast, the trades of all investors are fully included in the summation process that establishes 
trading volume. These arguments suggest that investors who trade based on random-walk 
earnings surprise are likely to affect trading volume relatively more than they affect prices, 


4 Brown et al. (1987), Wiedman (1990), and Hopwood and McKeown (1990) are all based on extensive samples (4,177 
quarterly announcements; 3,487 quarterly announcements; and annual announcements made by 258 firms over five 
years, respectively), while the Hughes and Ricks (1987) study is confined to 677 fourth quarter earnings announcements 
that are immediately preceded by a Standard and Poor’s analyst forecast. One possible explanation for the Hughes and 
Ricks result is that this analyst forecast may not have been “public information” if it was not widely disseminated at low 
cost before the earnings announcement date. 

5 None of the studies examines both prices and trading volume, and there are also significant differences among the studies’ 
variable definitions, sources of analysts' forecasts, time periods, and sample firms. 

6 Hakansson (1977) explains why some (rational) investors do not find it cost-effective to acquire private predisclosure 
information (e.g., forecasts of analysts included in the I/B/E/S compendium), or even to assimilate publicly disclosed 
information if this information was available privately to select investors in the predisclosure period. 

7 In a similar vein, results from experimental markets research suggest that prices in markets that include experienced 
traders are less biased than prices in markets with only inexperienced traders (Camerer 1987). 

8 The weight of existing research is consistent with informed traders mitigating (partially, if not completely) any effects 
of noise traders on prices, although there are exceptions (e.g., De Long et al. 1990b). 
consistent with our expectation that the higher random-walk-based unexpected earnings relative 
to analyst-based unexpected earnings, the higher trading volume reaction relative to price 
reaction. 

The final characteristic we investigate is the direction of the price change associated with the 
earnings announcement. Empirical research in finance documents that trading volume is higher, 
on average, for price upticks than downticks. Karpoff (1987) suggests that this asymmetric 
relation is due to institutional short sales regulations that constrain some investors from fully 
acting on bad news information that decreases their demand for the firm’s shares. Accordingly, 
we expect earnings announcements that generate heavy trading volume relative to price reactions 
to be associated with price increases, and vice versa.⁹ 


III. RESEARCH DESIGN 


Sample Design and Data Collection 


The study examines market reactions to earnings for fiscal quarters between 1986 and 1988, 
inclusive, which are announced between 1986 and 1989.¹⁰ The sample observations meet four 
selection criteria: (1) a quarterly earnings announcement date is available from COMPUSTAT 
(PST or Research file), (2) returns, excess returns, and trading volume data are available from (at 
least) the day before to five days after the earnings announcement, from the CRSP NYSE/AMEX 
files, (3) analyst forecast data are available from the Institutional Brokers Estimate System 
(I/B/E/S) tapes, and (4) the firm has a December 31 fiscal year-end.¹¹ These criteria yielded a 
sample of 8,180 quarterly earnings announcements by 1,079 firms. 


Volume Metrics 


Two different trading volume measures are employed. The first measure subtracts, from firm 
i’s daily percentage of shares traded, the percentage of shares traded on the NYSE.¹² The resulting 
daily market-adjusted trading volume for each firm is cumulated over two different length test 
periods: 
1) a two-day period—from day -1 to day 0, relative to the COMPUSTAT earnings 
announcement date, and 
2) a seven-day period—from day -1 to day +5. 
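
The construction of the market-adjusted volume measure can be sketched in a few lines. This is a minimal illustration, not the authors' code: the dictionary layout (event day mapped to daily percentage of shares traded) and the sample figures are hypothetical, and the function name is ours.

```python
def cumulative_market_adjusted_volume(firm_pct, market_pct, start, end):
    """Sum (firm % traded - market % traded) over event days start..end inclusive.

    firm_pct and market_pct are hypothetical dicts keyed by event day
    (0 = announcement date) holding the daily percentage of shares traded.
    """
    return sum(firm_pct[day] - market_pct[day] for day in range(start, end + 1))

# Hypothetical data: firm trading spikes on day 0; the market trades 0.25% daily.
firm = {-1: 0.50, 0: 1.25, 1: 0.75, 2: 0.50, 3: 0.25, 4: 0.25, 5: 0.25}
market = {day: 0.25 for day in firm}

two_day = cumulative_market_adjusted_volume(firm, market, -1, 0)    # days -1 to 0
seven_day = cumulative_market_adjusted_volume(firm, market, -1, 5)  # days -1 to +5
```

With these figures the two-day measure is 1.25 and the seven-day measure is 2.0, both in percentage points of shares outstanding.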


9 This aspect of our study differs in two ways from prior research in finance. First, the research reviewed in Karpoff (1987) 
focuses on the general price change-trading volume relation without regard to particular disclosures of firm-specific 
information. In contrast, we examine the relation between price and trading volume reactions to firm-specific earnings 
announcements. Second, most of the prior studies in finance documented either (1) a positive relation between trading 
volume and the absolute value of the price change, or (2) higher trading volume coincident with upticks than with 
downticks. However, our research design examines both simultaneously. 

10 A previous version of this study investigated reactions to annual earnings announcements made from 1980 to 1989. The 
inferences from that analysis are qualitatively similar to the results reported here, which are based on quarterly earnings 
announcements. In that previous analysis, 20—25 percent of the annual earnings announcements generated differential 
price and volume reactions. Moreover, high volume-low price reactions were associated with (1) more disagreement 
among analysts, (2) a larger analyst following, (3) higher random-walk unexpected earnings, and (4) good news. 

Also, restricting our analyses to interim (i.e., first, second, and third quarter) announcements yields results similar 
to those reported here. 

11 The December 31 fiscal year-end criterion facilitates matching COMPUSTAT earnings announcement dates with the 
last I/B/E/S forecast associated with that earnings announcement. This sample selection criterion is often used in 
research that employs analysts' forecasts (e.g., Hughes and Ricks 1987; O'Brien 1988; Hopwood and McKeown 1990). 

12 Daily total shares traded on the NYSE are obtained from Standard & Poor's Daily Stock Price Record, and are divided
by the total shares outstanding on the NYSE to yield the percentage of NYSE shares traded. Total shares outstanding 
on the NYSE are obtained from Standard & Poor's Statistical Service-Basic Statistics. 
These metrics are hereafter denoted VOL2MKT and VOL7MKT, for the two and seven day event
windows, respectively. 
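
The construction of the market-adjusted volume metric can be sketched as follows. This is a minimal illustration only: the function name, the list layout, and all numbers are hypothetical, and the daily percentages are assumed to be aligned so that one list position corresponds to day 0.

```python
def market_adjusted_volume(firm_pct, market_pct, event_index, start, end):
    """Cumulate (firm % of shares traded - NYSE % of shares traded)
    over event days start..end, where day 0 sits at event_index."""
    total = 0.0
    for offset in range(start, end + 1):
        day = event_index + offset
        total += firm_pct[day] - market_pct[day]
    return total


# Hypothetical daily percentages of shares traded (illustrative values only).
firm_pct = [0.20, 0.25, 0.60, 0.90, 0.40, 0.30, 0.25, 0.22, 0.20]
market_pct = [0.15, 0.15, 0.20, 0.20, 0.15, 0.15, 0.15, 0.15, 0.15]
event_index = 3  # list position of day 0 (assumed)

vol2mkt = market_adjusted_volume(firm_pct, market_pct, event_index, -1, 0)
vol7mkt = market_adjusted_volume(firm_pct, market_pct, event_index, -1, 5)
print(round(vol2mkt, 4), round(vol7mkt, 4))
```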

The second metric controls for cross-sectional differences in firm-specific average non- 
announcement period trading by scaling (dividing) firm i's announcement period percentage of 
shares traded by the firm's non-announcement period volume.¹³ More specifically, we divide the
percentage of firm i's shares traded in the two (seven) day announcement period by the firm- 
specific median non-announcement period volume (i.e., the median of: the percentage of the 
firm's shares traded, cumulated over contiguous two (seven) day non-announcement period 
intervals during the earnings announcement year). These metrics can be interpreted as the ratio 
of announcement-period trading volume relative to the firm's median non-announcement trading 
volume, and they are denoted VOL2PCT and VOL7PCT, respectively. 
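
A sketch of the scaled volume metric, under simplifying assumptions, is given below. The helper names are hypothetical, the non-announcement series is assumed to have already had the excluded windows removed, and intervals are formed simply by taking consecutive non-overlapping blocks of the remaining days.

```python
from statistics import median


def window_sums(values, width):
    """Sum values over consecutive, non-overlapping blocks of `width` days."""
    return [sum(values[i:i + width])
            for i in range(0, len(values) - width + 1, width)]


def scaled_volume(announcement_pct, non_announcement_pct):
    """Announcement-period volume divided by the firm's median
    non-announcement volume over intervals of the same length."""
    width = len(announcement_pct)
    baseline = median(window_sums(non_announcement_pct, width))
    return sum(announcement_pct) / baseline


# Hypothetical percentages of shares traded (illustrative values only).
non_announcement = [0.2, 0.3, 0.1, 0.3, 0.2, 0.2, 0.4, 0.2]
announcement = [0.6, 0.9]  # days -1 and 0

vol2pct = scaled_volume(announcement, non_announcement)
print(round(vol2pct, 3))
```

A value above one indicates announcement-period trading that exceeds the firm's typical non-announcement level.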


Price Metrics 


We employ two sets of return metrics that are constructed to parallel the two sets of trading 
volume metrics. Market-adjusted returns are based on CRSP beta excess returns, which subtract 
from firm i's return the contemporaneous return on a matched-beta portfolio. We cumulate these 
excess returns over the two (seven) day event windows. Because we are interested in the 
magnitude rather than the direction of the price reaction, we use the absolute value of the two 
(seven) day cumulative excess returns. These metrics are denoted RET2MKT and RET7MKT. 

The second set of absolute return metrics parallels the construction of the second set of trading
volume metrics by controlling for cross-sectional differences in firm-specific average non- 
announcement absolute returns. We use the absolute value of the two (seven) day announcement 
period cumulative return, divided by the firm-specific median of the two (seven) day non- 
announcement period absolute returns. The divisor is the median of: the absolute values of the 
cumulated two (seven) day non-announcement period returns, cumulated over contiguous two 
(seven) day non-announcement period intervals during the earnings announcement year. These 
metrics, denoted RET2PCT (RET7PCT), can be interpreted as the ratio of the absolute return in 
the announcement period, relative to the firm's median non-announcement period absolute 
return. 
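
The two return metrics differ from their volume counterparts mainly in the absolute-value step, which is applied after cumulating. The sketch below illustrates this ordering; the function names and return series are hypothetical, and the non-announcement series is again assumed to be pre-filtered.

```python
from statistics import median


def abs_cumulative_return(returns):
    """Absolute value of the return cumulated (summed) over a window."""
    return abs(sum(returns))


def scaled_abs_return(announcement, non_announcement):
    """Announcement-period absolute return relative to the median absolute
    return over non-announcement intervals of the same length."""
    width = len(announcement)
    blocks = [non_announcement[i:i + width]
              for i in range(0, len(non_announcement) - width + 1, width)]
    baseline = median(abs_cumulative_return(b) for b in blocks)
    return abs_cumulative_return(announcement) / baseline


# Hypothetical daily returns (illustrative values only).
announcement = [0.03, -0.01]
non_announcement = [0.01, -0.02, 0.005, 0.005, -0.01, -0.01, 0.02, -0.01]

ret2_abs = abs_cumulative_return(announcement)
ret2pct = scaled_abs_return(announcement, non_announcement)
print(round(ret2_abs, 4), round(ret2pct, 4))
```

Because the sign is discarded only after summing, offsetting daily returns within a window reduce the measured reaction.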


Firm-Specific Characteristics 


Section II explained the rationale for expecting the following characteristics to be associated
with differential price versus volume reactions: 

1) Dispersion in analysts' forecasts - DISP_{iq},

2) Range in analysts' forecasts - RANGE_{iq},

3) Number of financial analysts providing earnings forecasts to I/B/E/S - NOA_{iq},

4) Firm size - MKTVALUE_{iq},

5) Difference between random-walk (absolute) unexpected earnings and analysts'
forecast-based (absolute) unexpected earnings - UEDIFF_{iq},


13 The non-announcement period is defined as all trading days in the earnings announcement year, excluding 21-day
windows centered on the earnings announcement dates.
14 Dividing announcement period absolute returns by the firm-specific median non-announcement level of absolute
returns expresses the resulting announcement period metrics (RET2PCT, RET7PCT) as a percentage of their non- 
announcement values. Since we adopt an analogous approach for the trading volume metrics (VOL2PCT, VOL7PCT), 
these absolute return and volume metrics are denominated in comparable units of measure. Both express the magnitude 
of the earnings announcement reaction as a percentage of the analogous firm-specific non-announcement period values. 
However, even in "non-earnings-announcement" periods, other firm-specific information releases generate market 
reactions. If the resulting trading volume and absolute returns are similar to those generated by earnings announcements, 
metrics that control for firm-specific average volume and absolute returns are also likely to abstract from part of the 
earnings announcement effect of interest. 
6) Direction of associated price movement - PRICEUP_{iq}.

The first two characteristics reflect the heterogeneity of predisclosure expectations. These
two characteristics are based on the divergence in analysts’ earnings expectations. Dispersion
(DISP_{iq}) is the coefficient of variation in analysts' forecasts of quarter q EPS for firm i. RANGE_{iq}
is the difference between the most optimistic and most pessimistic analyst forecast of quarter q
EPS for firm i, scaled by the absolute value of the mean forecast of quarter q earnings. DISP_{iq} and
RANGE_{iq} are based on analysts’ forecasts in the last month that I/B/E/S forecasts quarter q
earnings. We require four or more analysts' EPS forecasts to compute DISP_{iq} or RANGE_{iq}, and
we exclude from the analysis announcements with mean EPS forecasts between -$.20 and $.20
due to the metrics' sensitivity to small denominators. (Analysis involving DISP_{iq} or RANGE_{iq} is
based on the 4,719 announcements that meet these additional criteria.)

The next two characteristics reflect incentives for market participants to acquire private
predisclosure information. NOA_{iq} is the number of financial analysts included in the mean
forecast during the last month that I/B/E/S forecasts quarter q earnings for firm i. MKTVALUE_{iq}
is the market value of firm i's common shares outstanding at the beginning of quarter q.

As explained earlier, prior research suggests that trading volume tends to track random-walk- 
based unexpected earnings (UERW), while price reaction tends to track analyst forecast-based 
unexpected earnings (UEA). Since we are interested in the difference between trading volume and 
(absolute) price reactions for a particular earnings announcement, we define UEDIFF as the 
difference between the absolute (percentage) forecast errors of random-walk vis-a-vis analyst 
forecast-based earnings expectations: 

$$UEDIFF_{iq} = UERW_{iq} - UEA_{iq},$$

where:

UERW_{iq} = absolute percentage forecast error from a seasonal random-walk expectation
model, for firm i's quarter q, and

UEA_{iq} = absolute percentage forecast error of the mean analyst forecast,
for firm i's quarter q.¹⁵

Finally, research in finance suggests that the magnitude of trading volume relative to price 
movements is likely to depend on whether the announcement is associated with a price increase 
or decrease. Thus, we define PRICEUP_{iq} as one if the two day (raw) return associated with firm
i's quarter q earnings announcement is positive, and zero otherwise. 
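
The UEDIFF and PRICEUP definitions can be illustrated with a short sketch. The function names and EPS figures are hypothetical; the seasonal random-walk expectation is assumed here to be EPS for the same quarter of the prior year, and the winsorization of the component forecast errors is omitted for brevity.

```python
def abs_pct_error(actual, expected):
    """Absolute forecast error as a fraction of the (absolute) expectation."""
    return abs(actual - expected) / abs(expected)


def ue_diff(actual_eps, eps_same_quarter_last_year, mean_analyst_forecast):
    """UEDIFF = UERW - UEA for one firm-quarter (no winsorization applied)."""
    uerw = abs_pct_error(actual_eps, eps_same_quarter_last_year)
    uea = abs_pct_error(actual_eps, mean_analyst_forecast)
    return uerw - uea


def price_up(two_day_raw_return):
    """PRICEUP = 1 if the two-day raw return is positive, and 0 otherwise."""
    return 1 if two_day_raw_return > 0 else 0


# Hypothetical firm-quarter: actual EPS 1.10, prior-year same-quarter EPS 0.80,
# mean analyst forecast 1.00.
example_uediff = ue_diff(1.10, 0.80, 1.00)
print(round(example_uediff, 4), price_up(0.012), price_up(-0.004))
```

In this example the random-walk error (0.375) exceeds the analyst-based error (0.10), so UEDIFF is positive.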


IV. ANALYSES AND RESULTS 


Frequency and Magnitude of Differential Price and Volume Reactions 


Although prior research has focussed on documenting similarities between price and volume 
reactions to earnings announcements, the potential for empirical trading volume-based research 
to provide new insights beyond price-based research depends on the extent to which trading 
volume reactions behave differently than price reactions. To document the extent to which 
earnings announcements generate price and volume reactions of different magnitudes, we 
perform a contingency table analysis. We begin by classifying the reaction to each earnings 


15 The absolute percentage forecast errors are defined relative to their respective expectations, and each component
forecast error (UERW, UEA) is winsorized at 97 percent.
announcement into a price reaction decile and a trading volume reaction decile. “Similar”
reactions are those for which the magnitudes of the volume and price reactions are similar—the 
absolute value of the difference between the price and volume reaction deciles is less than or equal 
to two. “Different” reactions are defined as those for which the absolute value of the difference 
between the price and volume reaction deciles is five or more (e.g., for a decile four price reaction, 
the volume reaction decile must be nine or ten). The “Different” reactions are further subclassified
into either the "Large Volume-Small Price reaction" or the "Small Volume-Large Price reaction"
categories. The remaining reactions—those for which the absolute value of the difference 
between the price and volume deciles is greater than two but less than five—are classified as 
"Indeterminate" reactions. 

We employ two null hypotheses: (1) price and volume reactions are independent, and (2) 
price and volume reactions are closely related. Under the null hypothesis of independence 
between price and volume reactions, an equal number of announcements would be expected to 
fall in each of the 100 cells in the contingency table. Given the above category definitions, 44 
percent of the reactions would be expected to fall into the Similar category, 15 percent would fall
in the Large Volume-Small Price reaction category, 15 percent would fall in the Small Volume- 
Large Price reaction category, and the remaining 26 percent would be expected to fall in the 
Indeterminate category. 
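
The expected percentages under independence follow from counting cells in the 10 x 10 decile grid. The sketch below reproduces that count; the function name is hypothetical, and decile 10 is assumed to denote the largest reaction.

```python
from collections import Counter


def classify(volume_decile, price_decile):
    """Assign a reaction to a category from its volume and price deciles."""
    gap = volume_decile - price_decile
    if abs(gap) <= 2:
        return "Similar"
    if gap >= 5:
        return "Large Volume-Small Price"
    if gap <= -5:
        return "Small Volume-Large Price"
    return "Indeterminate"


# Under independence each of the 100 cells is equally likely, so the cell
# counts are also the expected percentages.
cells = Counter(classify(v, p) for v in range(1, 11) for p in range(1, 11))
for category, count in sorted(cells.items()):
    print(category, count)
```

The counts are 44 Similar, 26 Indeterminate, and 15 in each of the two Different categories, matching the percentages stated above.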

In contrast, our second null hypothesis is that price and volume reactions are sufficiently 
closely related that all of the reactions fall into the Similar category. This null hypothesizes that 
price and volume reactions are sufficiently closely related that the trading volume reaction decile 
always falls within two deciles of the price reaction decile. For example, an earnings announce- 
ment that generates a decile three price reaction would be classified in the Similar category if the 
associated trading volume reaction falls anywhere from decile one to decile five inclusive— 
anywhere in the entire lower half of the trading volume reaction distribution. Consequently, this 
null hypothesis that all of the reactions fall into the Similar category is a much weaker condition 
than a hypothesis that price and volume are perfectly correlated. 

The contingency tables tabulate the frequency of earnings announcements that fall in each 
of the volume-price reaction decile cells. Since the analysis is based on 8,180 announcements and
100 cells, the expected frequency in each cell is 81.8 under the null hypothesis of independence. 
Table 1 reports the VOL2MKT by RET2MKT contingency table results. The summary at the 
bottom of table 1 reports the expected percentage of reactions (i.e., expected frequency) in each 
of the four categories (i.e., Similar, Indeterminate, Large Volume-Small Price reaction, and Small 
Volume-Large Price reaction), under the null hypothesis of independence. This is followed by the 
frequencies actually observed in the sample. A family of 99 percent Bonferroni confidence
intervals (Neter and Wasserman 1974) is also reported at the bottom of table 1.¹⁶ Table 2 presents
analogous summary statistics from three additional contingency tables: VOL7MKT by RET7MKT, 
VOL2PCT by RET2PCT, and VOL7PCT by RET7PCT. 
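
One way such a family of intervals could be computed is sketched below. It assumes a large-sample normal approximation for each proportion with the Bonferroni adjustment applied to the critical value; the paper does not spell out its exact computation, so the intervals produced here are illustrative and are not the values reported in the tables. The inputs are the VOL7MKT by RET7MKT frequencies quoted in the text and the sample size of 8,180.

```python
from math import sqrt
from statistics import NormalDist


def bonferroni_intervals(proportions, n, family_confidence=0.99):
    """Simultaneous intervals for g proportions; each interval uses the
    normal critical value at 1 - alpha / (2 * g)."""
    g = len(proportions)
    alpha = 1.0 - family_confidence
    z = NormalDist().inv_cdf(1.0 - alpha / (2 * g))
    intervals = {}
    for name, p in proportions.items():
        half_width = z * sqrt(p * (1.0 - p) / n)
        intervals[name] = (p - half_width, p + half_width)
    return intervals


observed = {
    "Similar": 0.5154,
    "Indeterminate": 0.2449,
    "Large Volume-Small Price": 0.1203,
    "Small Volume-Large Price": 0.1194,
}
intervals = bonferroni_intervals(observed, n=8180)
for name, (low, high) in intervals.items():
    print(f"{name}: {low:.4f} to {high:.4f}")
```

With four intervals and a 99 percent family coefficient, each individual interval is built at the 99.75 percent level, which is why the intervals are wider than unadjusted 99 percent intervals.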

The results are quite similar across tables 1 and 2. Under a null hypothesis of independence, 
44 percent of the reactions would be expected to fall in the Similar category. The actual 


16 To make inferences regarding the quadruple of frequencies (Similar, Indeterminate, Large Volume-Small Price, Small 
Volume-Large Price), we compute Bonferroni confidence intervals (Neter and Wasserman 1974) such that the family
confidence coefficient for the quadruple of confidence intervals is 99 percent. If repeated samples were selected and
Bonferroni confidence intervals were estimated for each of the four frequencies, then each of the four true (unknown) 
population frequencies would lie within the respective confidence interval in 99 percent of the samples. For only one 
percent of the samples would one or more of the true (unknown) population frequencies lie outside the respective 
frequencies of Similar reactions are 8-11 percentage points higher than would be expected under the null
hypothesis of independence, and the actual frequencies range from 51.54% (VOL7MKT by
RET7MKT) to 54.95% (VOL2PCT by RET2PCT). Although the 99 percent Bonferroni confidence
intervals for the “Similar” frequencies lie above 44 percent, only about half of the earnings
announcements generate price and volume reactions that are similar in magnitude. While the
magnitudes of price and volume reactions to earnings announcements are significantly positively
associated, this association is probably weaker than is generally believed.¹⁷ The proportion of
reactions classified as Similar (approximately 53 percent) is much closer to the 44 percent that would be
expected under independence than to the 100 percent that would be expected under strong positive
association.

The proportion of reactions falling in the Indeterminate cells is similar to what would be 
expected under the null of independence. For the Indeterminate reactions, the actual frequencies 
range from 24.49% (VOL7MKT by RET7MKT) to 25.76% (VOL7PCT by RET7PCT), very 
close to the expectation under the null of independence (26 percent). 

Under the null hypothesis of independence, we would expect 15 percent of the reactions to 
fall into the Large Volume-Small Price reaction category, and another 15 percent to be classified 
in the Small Volume-Large Price reaction category. However, the actual frequencies are lower. 
In the Large Volume-Small Price reaction category, they range from 10.77% (VOL2PCT by 
RET2PCT) to 12.03% (VOL7MKT by RET7MKT). Similarly, the actual frequencies in the 
Small Volume-Large Price reaction category range from 9.14% (VOL2PCT by RET2PCT) to 
11.94% (VOL7MKT by RET7MKT).¹⁸ In all these comparisons, the confidence intervals lie
below 15 percent—but well above 0 percent. Since these two “Different” categories include only 
those announcements where trading volume and abnormal return reactions are at least five deciles 
apart, these results indicate that the relative magnitudes of price and volume reactions are very 
different for 20-24 percent of the sample earnings announcements. This result suggests that 
trading volume-based research has the potential to yield insights beyond those attainable through 
price-based research. 


Association Between Characteristics and Differential Price-Volume Reactions:
Univariate Analyses 


Section II lists several characteristics that we expect to be associated with differential price-
trading volume reactions. While prior research has investigated univariate relations between 
some of these characteristics and either price or volume, the relation between each of these 
characteristics and differential price-volume reactions has not been explored. Hence, we first 


17 Researchers often assume that price and volume reactions are closely related. For example, consider Lev's (1989, 156) 
statement: 
The stock price change is, of course, a restricted indicator of information usefulness, since in a 
heterogeneous belief setting, investors might use the information without the price being changed. 
Volume of trading is a more sensitive indicator of information usefulness. In reality, however, price and 
volume changes are, in general, highly correlated (emphasis added). 
The evidence provided in tables 1 and 2, and our finding (not shown) that the correlation between price and volume 
reactions in our sample ranges between .2 and .3, are inconsistent with Lev's conjecture that price and volume changes 
are "highly correlated."
18 Qualitatively similar inferences arise from other definitions of “Similar” and "Different" categories.
19 We examined whether a particular firm's earnings announcements tend to fall into the same differential reaction
category, quarter after quarter. For earnings announcements that are classified in either the Large Volume-Small Price 
or the Small Volume-Large Price categories, in 80-90 percent of the cases (depending on the metrics), the firm's 
subsequent quarter (q+1) earnings announcement does not fall into that same differential reaction category. For 95-99.9
percent of the quarter q announcements that are classified in either the Large Volume-Small Price or Small
Volume-Large Price category, at least one of the next two announcements does not fall into that same category. 
present evidence on the univariate relations, and the next section presents results of a multivariate
analysis. 

The purpose of this univariate analysis is to examine characteristics associated with those 
earnings announcements that generate price and volume reactions of extremely different 
magnitudes. Therefore, table 3 reports the median values of each characteristic (e.g., DISP) for 
those earnings announcements in the Large Volume-Small Price reaction category and for those 
announcements in the Small Volume-Large Price reaction category.²⁰ The difference between the
values of each characteristic across the two differential reaction categories appears in the
rightmost column.²¹

Results of the univariate comparisons are consistent with the expectations articulated in section
II. Earnings announcements that generate Large Volume-Small Price reactions are associated with
significantly more divergent predisclosure analysts' EPS forecasts (i.e., higher DISP and RANGE)
than are announcements that generate Small Volume-Large Price reactions (α < .05).²²

As expected, earnings announcements that generate Large Volume-Small Price reactions are
forecast by significantly more analysts than are earnings announcements that generate Small
Volume-Large Price reactions (α < .009). In contrast to the analyst following-based results,
however, inferences from MKTVALUE depend on the metric. Results based on market-adjusted
metrics suggest that earnings announcements generating high volume relative to price reaction
are on average made by large firms, and conversely. However, the statistical significance of this
difference disappears when VOL2PCT/RET2PCT are employed. The likely explanation for this
difference follows from Arbel and Strebel’s (1982) suggestion that there is less intertemporal
variation in firm size than in analyst following. To the extent that size is a firm-specific
characteristic with limited time-series variability, MKTVALUE reflects the difference in the
firm's average volume and price movement. Consequently, when we control for the firm's
average volume and absolute returns via VOL2PCT/RET2PCT, the significance of the relation
between MKTVALUE and differential volume versus price reaction evaporates.²³


20 We report results based on 2-day event windows because prior research has shown that most of the price and volume
reaction occurs on days -1 and 0. However, since unusual trading can persist up to five days after earnings 
announcements (Morse 1981; Bamber 1987), we also investigated a 7-day window that spans days -1 to +5. This analysis 
generally yields similar inferences, so the 7-day window results are not discussed further unless differences occur. We 
also repeated the analyses based on a "combination" 2-day return window and 7-day volume window. This analysis 
yields inferences similar to those reported here—the results nearly always fall in-between the results of the pure 2-day 
and the pure 7-day analyses. 

21 Table 3 employs the nonparametric Wilcoxon rank sum procedure to assess the statistical significance of the difference
in values of each characteristic across the two differential volume-price reaction categories, because the distributions
of several of these characteristics (especially DISP, RANGE, and MKTVALUE) are skewed.

22 The DISP and RANGE analysis presented in table 3 extends that reported in Atiase and Bamber (1994) in several ways.
Our paper assesses the extent to which earnings announcements generate differential volume-price reactions, and then
investigates whether these differential reactions are associated with announcement-specific characteristics (e.g.,
divergence in analysts' forecasts). In contrast, Atiase and Bamber (1994) do not provide evidence on differential
reactions. Furthermore, we rank price and volume reactions into deciles, which forms a common basis or scale for
comparing price versus volume reactions, and our categorical statistical analyses do not require linear relations among
the variables.

23 VOL2PCT (RET2PCT) is scaled by firm-specific median non-announcement period volume (absolute returns). These
measures endogenously control for non-announcement volume and absolute returns, so their results reflect earnings
announcement-induced phenomena. However, the market-adjusted metrics do not control for firm-specific
non-announcement average volume and returns, so we repeated the univariate analysis in table 3 employing differential
market-adjusted volume-price movements at pseudo-event dates to establish benchmark nonannouncement
(market-adjusted) volume and price activity. This analysis suggests that inferences from the univariate analysis reported in table
3 apply to announcement-induced phenomena, except that the relation between (1) NOA and MKTVALUE and (2)
differential volume relative to price movements (measured via the market-adjusted metrics) appears to be largely due
to general "non-announcement" relations. This is not necessarily surprising. Firms constantly make public disclosures
in addition to earnings announcements, and if NOA and MKTVALUE proxy for general incentives to acquire private 
predisclosure information, the resulting reduced surprise and increased differential belief revision is also likely to 
Table 3 also shows that earnings announcements generating Large Volume-Small Price
reactions are associated with higher UEDIFF (α < .002) than are announcements generating
Small Volume-Large Price reactions, as expected.²⁴ The higher the (absolute) random-walk-based
unexpected earnings relative to (absolute) analyst-based unexpected earnings, the higher
the trading volume reaction relative to the price reaction.

PRICEUP% in table 3 shows the percentage of earnings announcements that are associated
with price increases. As expected, announcements in the Large Volume-Small Price category are 
more likely to be associated with price increases than are earnings announcements in the Small 
Volume-Large Price category (α < .0001).²⁵ In sum, the univariate comparisons suggest that
earnings announcements generating high trading volume relative to price reactions are on average
associated with (1) more divergent analyst EPS forecasts; (2) larger analyst following; (3) high
random-walk-based unexpected earnings relative to analyst-based unexpected earnings; and (4)
price increases.²⁶


Multivariate Analysis 


Since many of the characteristics are related statistically as well as conceptually, we employ 
a multivariate analysis to ascertain the extent to which these characteristics capture distinct (non- 
overlapping) factors associated with differential price and volume reactions. The analysis 
includes DISP, NOA, MKTVALUE, UEDIFF, and PRICEUP. RANGE is dropped due to its 
redundancy with DISP.²⁷

We employ an ordered response logit model that incorporates information embedded in the 
ordering of the dependent variable and is appropriate when the dependent variable’s values are 


(Footnote 23 continued) 

generate high trading relative to price reaction for these other announcements. Nevertheless, evidence that NOA is still 
significant in table 3 using VOL2PCT/RET2PCT indicates that NOA is also significantly associated with announce- 
ment-induced differential reactions. 

24 Consistent with prior evidence that random-walk forecasts are, on average, less accurate than analysts’ forecasts (e.g.,
Bamber 1987; Hopwood and McKeown 1990), table 3 shows that UEDIFF is positive (i.e., random-walk forecast error
exceeds analysts' forecast error) for both categories of earnings announcements. However, the issue of interest in this
study is whether UEDIFF is greater for earnings announcements that generate Large Volume-Small Price reactions than
for those that generate Small Volume-Large Price reactions. Results reported in table 3 confirm this expectation.

25 To investigate whether the PRICEUP relation documented in table 3 captures more than a non-announcement effect,
we estimated the following model:

$$VOL = a + b\,|PR| + c\,[(PRICEUP) \times |PR|] + d\,[(EVENT) \times (PRICEUP) \times |PR|]$$

where:

VOL = trading volume reaction (VOL2MKT, VOL2PCT),

|PR| = absolute price reaction (RET2MKT, RET2PCT),

PRICEUP = 1 if the contemporaneous two-day return is positive, and 0 otherwise,

EVENT = 1 if an earnings announcement period (i.e., days -1 to 0), and 0 otherwise. (Non-announcement periods
are defined as eight 2-day periods around each announcement: days -11 and -10; days -9 and -8;
days -7 and -6; days -5 and -4; days +4 and +5; days +6 and +7; days +8 and +9; and days +10 and
+11.)

VOL and |PR| are winsorized at 97 percent to mitigate the effects of extreme values. We found that b, c, and d are all
significantly positive. In this model: (1) finding b significantly >0 means that trading volume is an increasing function
of the magnitude of the contemporaneous price reaction, (2) finding c significantly >0 suggests that trading volume
is higher for (contemporaneous) upticks than downticks, and (3) finding d significantly >0 reveals the "uptick
increment" is higher, on average, in announcement periods than in non-announcement periods. Evidence that d is
significantly greater than zero indicates that our analysis captures more than just the non-announcement period relation
between the direction of the price change and differential price and volume reactions.

26 Analysis based on 7-day event windows yields similar inferences, except that the difference between Large Volume-
Small Price and Small Volume-Large Price categories is not significant for VOL7PCT/RET7PCT with (1) UEDIFF
(α = .24), and (2) PRICEUP (α = .13).

27 Including RANGE instead of DISP yields results that are qualitatively similar to those reported in table 4.
categorical, but of ordinal scale. The values of our dependent variable are three categories of
volume-price reactions, ordered according to the magnitude of volume reaction relative to price
reaction:²⁸

Large Volume-Small Price reaction = 1

Similar Volume and Price reaction = 2 

Small Volume-Large Price reaction = 3 
(We obtain qualitatively similar results when we restrict the analysis to the Large Volume-Small 
Price versus the Small Volume-Large Price categories, and use ordinary logit analysis.) 

The ordered response logit model fits the following function: 


$$g(\Pr(CAT \le i \mid X)) = \alpha_i - \beta X \qquad (1)$$
where
CAT = category value assumed by dependent variable,
X = vector of independent variables,

1 ≤ i ≤ k-1, with k = 3, the number of values the dependent variable assumes.

Equation 1 fits the probability that a reaction is from (ordered) category i or lower, given the 
observed vector of explanatory variables. In the following analysis, a positive logit coefficient
means that higher values of the independent variable are associated with higher trading volume 
(relative to price) reactions. 

Cheng et al. (1992) suggest that analysis based on relative ranks is more robust than standard
parametric linear regression. Hence, we estimate the logit models using the standardized ranks
of the independent variables’ values. (The dependent variables’ categorical classification is
already based on relative ranks, i.e., deciles.) Following Cheng et al. (1992), we divide each
independent variable’s rank by N+1, so that each ranked variable has a maximum value of N/(N+1)
and a minimum value of 1/(N+1), where N = number of observations in the model. Cheng et al.
(1992) point out that this procedure yields coefficients that are independent of the number of
observations.²⁹
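
The rank standardization can be illustrated with a short sketch. The function name and input values are hypothetical, and ties are broken by position here because the text does not say how ties are handled.

```python
def standardized_ranks(values):
    """Replace each value by rank / (N + 1), with rank 1 for the smallest.
    Ties are broken by original position (an assumption)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return [r / (n + 1) for r in ranks]


# Hypothetical values of one independent variable (illustrative only).
raw = [3.2, 0.5, 7.1, 1.8]
scaled = standardized_ranks(raw)
print(scaled)
```

With N = 4 the scaled values run from 1/5 to 4/5, so every ranked variable lies strictly between zero and one regardless of its original units.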

Table 4 presents the ordered response logit model coefficients and their corresponding
significance levels. This multivariate analysis confirms the univariate results reported in table 3.
As expected, more divergent analysts' forecasts (i.e., higher DISP) are associated with higher
trading volume relative to price reaction (α < .05). Earnings announcements that have been
forecasted by many analysts are more likely to generate higher trading volume relative to price
reaction (α < .01). However, MKTVALUE is generally not a significant explanator of differential
volume-price reactions when NOA is also included in the logit model.³⁰ Finding that NOA
dominates MKTVALUE in our logit analysis is consistent with Dempsey's (1989) evidence that


28 The “Indeterminate” reactions are excluded from the logit analysis. This reduces the error in assigning the dependent
variable into categories. 

29 Inferences from logits based on raw (unstandardized) ranks are identical to those based on relative ranks that are reported
here. Inferences from logits based on cardinal values of the independent variables (rather than ranks) are also quite
similar to those reported here, with one exception. In the logits based on cardinal values, the UEDIFF coefficients are
always positive, but are significant only for the market-adjusted metrics (α = .15 for VOL2PCT/RET2PCT).

30 When MKTVALUE is included in a logit analysis without NOA, the inferences are identical to those based on the
univariate analysis: the MKTVALUE coefficient is significantly positive only for the market-adjusted metrics. 
firm size has no incremental explanatory power (for abnormal returns), after controlling for the
number of analysts following the firm.³¹

As expected, the UEDIFF coefficients are positive (α < .03), meaning that the higher
(absolute) random-walk unexpected earnings are relative to analyst-based unexpected earnings,
the higher volume is relative to price reaction. Finally, PRICEUP is associated with higher trading
volume relative to price reaction (α < .002). In sum, the ordered response logit model results
support the inferences drawn from the univariate analysis, as well as the expectations articulated
in section II.³²

Up to this point, the analysis (particularly table 3) has focused on those earnings announce- 
ments that generate very different magnitudes of price versus volume reactions, and we have 
presented evidence on the relation between such differential reactions and announcement- 
specific characteristics. The results suggest (but do not directly test) the notion that understanding 
the characteristics of a particular sample of announcements may be helpful in anticipating 
whether a study's inferences are likely to be sensitive to the choice between volume-based vs. 
price-based analyses. For example, if a sample of announcements is characterized by high DISP, 
RANGE, NOA, MKTVALUE, UEDIFF, or PRICEUP, are we likely to find a higher trading 
volume reaction than price reaction? The analysis in the next section addresses this question. 


Do Announcement-Specific Characteristics Anticipate Differences in Volume-Price 
Reactions?³³


We begin by partitioning the sample earnings announcements into quartiles for each of the 
six characteristics (DISP, RANGE, NOA, MKTVALUE, UEDIFF, and PRICEUP), and then
restrict the subsequent analysis (pertaining to each characteristic) to announcements falling in the
highest or lowest quartile of that characteristic.³⁴ Since we are interested in whether trading
volume reaction is higher than the associated price reaction for the highest quartile of each
characteristic, and vice versa, we subtract the price reaction decile from the volume reaction
decile. Positive differences indicate higher trading relative to price reaction, and conversely.
Given our previous results, we expect to find a higher volume reaction than price reaction (i.e.,
positive differences) for the highest quartiles of DISP, RANGE, NOA, UEDIFF, and PRICEUP,
and a higher price reaction than volume reaction (i.e., negative differences) for the lowest 
quartiles. 
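The partitioning procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's actual code: the function names, the rank-based decile rule, and the simple tie handling are hypothetical, and the matched-pair t statistic is computed as a one-sample t on the decile differences.

```python
import math
import statistics

def decile_ranks(values):
    """Assign deciles 1..10 by rank (ties broken by original order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    deciles = [0] * len(values)
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // len(values) + 1
    return deciles

def extreme_quartiles(characteristic):
    """Return index lists for the lowest and highest quartiles."""
    if len(characteristic) < 4:
        raise ValueError("need at least four observations")
    order = sorted(range(len(characteristic)), key=lambda i: characteristic[i])
    q = len(order) // 4
    return order[:q], order[-q:]

def paired_t(differences):
    """Mean difference and its t statistic (mean over standard error).

    Requires at least two differences that are not all identical.
    """
    n = len(differences)
    mean = statistics.mean(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    return mean, mean / se

def volume_minus_price(volume, price, characteristic):
    """Volume decile minus price decile, summarized for the extreme quartiles."""
    vol_dec = decile_ranks(volume)
    prc_dec = decile_ranks(price)
    diff = [v - p for v, p in zip(vol_dec, prc_dec)]
    low, high = extreme_quartiles(characteristic)
    return {
        "high": paired_t([diff[i] for i in high]),
        "low": paired_t([diff[i] for i in low]),
    }
```

Under the expectations stated in the text, the mean difference returned for `"high"` would be positive and the one for `"low"` negative for characteristics such as DISP or NOA.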

Table 5 presents the cross-sectional mean difference between the volume reaction decile and 
the associated price reaction decile, and the significance of this difference (per matched-pair t 



31 Dempsey (1989) suggests that when analysts decide which firms to follow (and in turn, for which firms they will produce
information), they consider several factors, of which firm size is only one. Moreover, Arbel and Strebel (1982) suggest
that while size is a relatively enduring characteristic of firms, there is more inter-temporal variation in analysts
following. Thus, it is not surprising that NOA dominates MKTVALUE in explaining market reactions to earnings
announcements.

32 For the 7-day event window, the inferences are similar for VOL7MKT/RET7MKT, but for VOL7PCT/RET7PCT only
the NOA coefficient is significantly positive. Furthermore, an earlier version of this study reported results based on
unadjusted trading volume (simple percentage of shares traded) and unadjusted absolute returns. These results are not
reported because they are similar to (and generally stronger than) the results reported here.

33 We thank an anonymous reviewer for suggesting the analysis in this section.

34 In contrast to the analysis in table 3, which examines those earnings announcements falling into either the Large Volume-
Small Price or the Small Volume-Large Price categories, the analysis reported in table 5 regarding each individual
characteristic investigates earnings announcements that are associated with the highest and lowest quartiles of that
particular characteristic. (Since PRICEUP is a binary variable, all announcements associated with price decreases are
classified in the lowest PRICEUP category ("quartile"), and all announcements associated with price increases are
classified in the highest PRICEUP category ("quartile").)
TABLE 4

Ordered Response Logit Analysis of Trading Volume-Absolute Price Reaction
Categories on Selected Characteristicsᵃ

Volume/Price Metricᵇ   DISPᶜ     NOAᵈ      MKTVALUEᵉ   UEDIFFᶠ   PRICEUPᵍ
VOL2MKT/RET2MKT        .475      .603      -.034       .156      .214
                       (.0001)   (.0001)   (.65)       (.02)     (.0002)
VOL2PCT/RET2PCT        .160      .234      -.081       .150      .187
                       (.05)     (.01)     (.82)       (.03)     (.002)


ᵃ The ordered response logit model fits the following function:

    $g(\Pr(\mathrm{CAT} \le i \mid X)) = \alpha_i + \beta X, \quad 1 \le i \le k-1$

where in our case, k = 3. The model fits the probability that an observation is from (ordered) category i or lower. Our categories
are ordered by the difference between the trading volume decile and the price reaction decile:

1 = Large Volume-Small Price (volume decile exceeds price decile by five or more),

2 = Similar Volume and Price (absolute difference between price and volume deciles is less than or equal to two),
and

3 = Small Volume-Large Price (price decile exceeds volume decile by five or more).

Thus, a positive logit coefficient is associated with increased likelihood of a higher trading volume (relative to price) reaction.

Tabled values are the logit coefficient when logit is performed on the scaled ranks of the independent variables, and the
one-tailed significance level (in parentheses).

ᵇ VOL2MKT is the firm's % of shares traded minus the % of NYSE firms' shares traded, cumulated over days -1 and
0. VOL2PCT is the firm's 2-day announcement period percentage of shares traded divided by the firm's median
non-announcement percentage of shares traded, computed over contiguous 2-day non-announcement periods.
RET2MKT is the absolute value of the cumulative CRSP beta excess return, cumulated over the 2-day event window.
RET2PCT is the absolute value of the cumulative 2-day announcement period raw return, divided by the firm's median
absolute non-announcement period raw return, computed over contiguous 2-day non-announcement period intervals.

ᶜ DISP is the standard deviation of the analyst forecasts, deflated by the absolute value of the mean forecast.

ᵈ NOA is the number of analysts forecasting the firm's earnings.

ᵉ MKTVALUE is the market value of a firm's shares [...]

ᶠ UEDIFF is the difference between random-walk-based unexpected earnings (the absolute percentage forecast error
from a seasonal random-walk model) and analyst-based unexpected earnings (the absolute percentage forecast error of
the mean analyst forecast).

ᵍ PRICEUP equals 1 if the earnings announcement is associated with rising prices, and 0 otherwise.





tests) for (1) earnings announcements falling in the highest quartile, and for (2) those falling in
the lowest quartile of each characteristic.35 The third column presents the difference between (1)
and (2).36


35 Nonparametric Wilcoxon tests lead to qualitatively similar inferences. Specifically, the magnitude of trading volume 
relative to price reaction is significantly higher for the highest quartiles of DISP, RANGE, NOA, UEDIFF, and 
PRICEUP, than for the lowest quartiles of each characteristic. 

36 The results reported in table 5 are based on earnings announcements with complete data for all six characteristics. This
allows us to construct a single set of price and volume decile assignments for all six analyses, with the advantage that 
a particular announcement's price and volume decile assignments are the same across analyses for any of the six 
characteristics for which this observation falls into either the high or the low quartile. (This restriction is unlikely to affect 
our inferences. For each of the six characteristics, we recomputed new price and volume decile assignments after 
including all earnings announcements with data for that characteristic, and we repeated the analysis. Inferences from 
this sensitivity analysis are similar to those from table 5.) 
TABLE 5
Difference Between Trading Volume Reaction Decile and Price Reaction [...]
Earnings Announcements Falling in the Highest Quartile Versus the Lowest [...]
of Characteristicsᵃ

                                     Volume Reaction - Price Reaction
                                     (a)                 (b)                 [...]
                  Volume-Price       Highest Quartile    Lowest Quartile
Characteristicᵇ   Metric             of Characteristic   of Characteristic
DISP              VOL2MKT-RET2MKT     .4397              -.3407
                                     (.0001)             (.001)
                  VOL2PCT-RET2PCT     .1697              -.2964
                                     (.06)               (.001)
RANGE             VOL2MKT-RET2MKT     .5058              -.5361
                                     (.0001)             (.0001)
                  VOL2PCT-RET2PCT     .1988              -.3858
                                     (.03)               (.0001)
NOA               VOL2MKT-RET2MKT     .8002              -.5620
                                     (.0001)             (.0001)
                  VOL2PCT-RET2PCT     .1889              -.0712
                                     (.03)               (.25)
MKTVALUE          VOL2MKT-RET2MKT     .1514              -.4917
                                     (.07)               (.0001)
                  VOL2PCT-RET2PCT    -.1260               .0694
                                     (.91)               (.73)
UEDIFF            VOL2MKT-RET2MKT     .7358              -.1610
                                     (.0001)             (.08)
                  VOL2PCT-RET2PCT     .3192              -.1340
                                     (.001)              (.11)
PRICEUPᶜ          VOL2MKT-RET2MKT     .1951              -.2857
                                     (.005)              (.0003)
                  VOL2PCT-RET2PCT     .1224              -.2989
                                     (.05)               (.0001)


ᵃ Tabled values are the cross-sectional mean difference between the volume reaction decile and the price
reaction decile. Negative values mean that the price reaction decile is higher than the volume reaction decile; positive
values mean that the volume reaction decile is higher than the price reaction decile. The one-tailed significance of the
difference (per matched-pair t tests) appears in parentheses in columns (a) and (b).

ᵇ Variables are defined in table 3.

ᶜ PRICEUP is a binary variable coded 0 if the earnings announcement is associated with a price decrease, and 1 if
associated with a price increase. All observations associated with price decreases are classified in the lowest
"quartile" category, and all announcements associated with price increases are classified in the highest
"quartile" category.
The evidence reported in table 5 generally supports our expectations, as well as the results
presented in tables 3 and 4. The highest quartiles of DISP and RANGE are associated with
significantly higher volume than price reactions, while in contrast, the lowest quartiles of both
DISP and RANGE are associated with significantly higher price reactions than volume reactions.
The highest quartile of NOA is associated with significantly higher volume than price reactions,
while the lowest quartile is associated with higher price than volume reactions (although the
difference in the lowest quartile is not significant for VOL2PCT-RET2PCT; α = .25). For the
highest quartile of UEDIFF, volume reaction is higher than price reaction, and conversely for the
lowest quartile. Earnings announcements that generate price increases (i.e., high PRICEUP) are
associated with higher volume than price reaction, while low PRICEUP is associated with higher
price relative to volume reaction. Finally, the mixed MKTVALUE results in table 5 are consistent
with the mixed evidence regarding the relation between MKTVALUE and differential price-
volume reactions reported in tables 3 and 4.37

For all the characteristics except MKTVALUE, the differences between the volume minus
price reaction statistics for the (1) highest versus (2) lowest quartiles of the announcement-
specific characteristics are consistent with expectations, and as shown in the third column of table
5, the differences between (1) and (2) are significant at α < .04. Trading volume reaction is low
relative to price reaction for the lowest quartiles of DISP, RANGE, NOA, UEDIFF, and
PRICEUP. Volume is high relative to price reaction for the highest quartiles of these
characteristics.38 In sum, the analysis reported in table 5 supports inferences from the preceding univariate
and multivariate analyses.39


V. SUMMARY AND CONCLUDING COMMENTS 


The primary objective of this study is to provide empirical evidence on the extent to which 
accounting earnings announcements generate heavy trading but minimal price change, or vice 
versa. Although price and volume reactions to earnings announcements are significantly 
positively associated, on average, the frequency of earnings announcements that generate trading 
volume and price reactions of similar magnitudes is only 8-11 percent higher than would be 
expected under the null hypothesis that trading volume and price reactions are independent. While 
there is a significant positive association between volume and price reactions, this positive 
relation obscures the facts that (1) the relative magnitudes of price and volume reactions are 
extremely different for 20-24 percent of our sample earnings announcements, and (2) the 


37 Results for the 7-day window yield similar inferences for DISP, RANGE, NOA, and UEDIFF, but were generally
insignificant for PRICEUP.

38 We also repeated this analysis using standardized cardinal volume and price measures, instead of rank-based deciles.
From log-transformed values of each announcement-specific cardinal volume (price) reaction metric (e.g., VOL2MKT
or RET2MKT, etc.) we subtract the cross-sectional mean of that metric, and then divide by the cross-sectional standard
deviation of the metric. As in the rank-based decile analysis, we subtract the resulting return metric from the trading
volume metric. Inferences from this analysis are qualitatively similar to those reported in table 5, except that in the
cardinal analysis (1) the volume minus price reaction difference for the highest quartile of DISP is insignificantly
positive (α = .15) using cardinal analogues of VOL2PCT-RET2PCT, and (2) in both the high and the low PRICEUP
categories, the cardinal market-adjusted volume minus price reaction differences are not significantly different from
zero.

39 In the analyses that investigate the association between the characteristics and differential volume-price reactions
(tables 3-5), our results are often stronger for VOL2MKT/RET2MKT than for VOL2PCT/RET2PCT, although the
inferences are generally similar. This disparity in the strength of the results is not necessarily surprising. As explained
in footnote 14, we cannot isolate a clean "non-announcement" period, so VOL2PCT/RET2PCT's control for firm-
specific average volume and absolute returns may also abstract from part of the announcement effect of interest.
observed relation between price and trading volume reactions is closer to independence than to
a strong positive relation. 
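To make the independence benchmark concrete, the following Python sketch enumerates all pairs of volume and price deciles and counts how many fall into each reaction category. It assumes the cutoffs given in the notes to table 4 (similar: absolute decile gap of two or less; extreme: gap of five or more) and that every decile pair is equally likely under the null; the cutoffs behind the percentages quoted in this section may differ, so the output is an illustration only.

```python
from fractions import Fraction
from itertools import product

def independence_shares(n_bins=10, similar_gap=2, large_gap=5):
    """Expected share of announcements per category if volume and price
    deciles were independent (every decile pair equally likely)."""
    pairs = list(product(range(1, n_bins + 1), repeat=2))
    total = len(pairs)
    similar = sum(1 for v, p in pairs if abs(v - p) <= similar_gap)
    large_volume = sum(1 for v, p in pairs if v - p >= large_gap)
    large_price = sum(1 for v, p in pairs if p - v >= large_gap)
    return {
        "similar": Fraction(similar, total),
        "large_volume_small_price": Fraction(large_volume, total),
        "small_volume_large_price": Fraction(large_price, total),
    }

shares = independence_shares()
# Under these assumed cutoffs: 44 of the 100 decile pairs are "similar",
# and 15 of 100 fall in each of the two extreme categories.
```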

Our evidence further suggests that earnings announcements that generate a high trading
volume reaction relative to price reaction are associated with (1) more divergent financial analysts'
(predisclosure) earnings forecasts; (2) a large analyst following; (3) higher random-walk-based
unexpected earnings relative to analyst-based unexpected earnings; and (4) price increases.
These results are broadly consistent with the notion that trading volume reaction is likely to be 
high (relative to price reaction) when an announcement generates differential belief revisions 
among individual investors. 

We believe the evidence reported here highlights the potential for empirical trading volume- 
based research to yield new insights regarding the effects of information on stock market 
participants, since we find that trading volume often behaves differently than stock prices. 
Moreover, our evidence suggests that these differential reactions are related to announcement- 
specific characteristics. Such empirical evidence on factors associated with differential price- 
volume reactions can potentially help theoreticians develop more complete (and descriptively 
valid) theories of trading volume. From an empirical perspective, the study’s results suggest that 
it would be prudent for future information content research to continue to examine both price and 
volume reactions. 
I. INTRODUCTION


Traditional approaches to auditing require a heavy commitment of valuable human
resources: the time and professional judgment of experienced auditors. However,
despite the central importance of human resources as the primary input in auditing, little
is known about how environmental and institutional factors influence the allocation of these 
resources (Messier 1994; Brown and Solomon 1992). This study investigates how the use of 
"structured" audit approaches affects task-level human resource assignments in two audit 
environments varying in complexity. 

Both the professional and academic accounting literatures assert that structured approaches 
can facilitate the use of relatively inexperienced decision makers to perform judgment-oriented 
tasks (e.g., Elliott and Kielich 1985; Ashton and Willingham 1988; Libby and Luft 1993). From 
a theoretical perspective, structured approaches may reduce the need for decision-maker 
experience through formalization and standardization of common tasks, removal of strategic 
decisions from the field auditor's discretion, and use of decision tools which allow for
"knowledge-sharing." The accounting and decision-making literatures have also posited that
structured approaches are likely to be less applicable in more complex environments, suggesting
that the effects of audit structure are potentially contingent on the setting in which it is employed. Finally,
this paper argues that by providing a common language and invoking inherently more explicit and 
consistent delineations of the steps involved in particular tasks, the use of structured approaches 
is likely to decrease the variability of staffing decisions across auditors. This study represents the 
first effort reported in the literature to examine directly the effects of structured approaches on 
auditing firms' human resource assignments. 

An experiment employing firm and task-level audit structure measurements and a two-level 
manipulation of the environment provides data to test predicted differences in audit managers' 
task-level staffing assignments. Managers from four of the Big-6 public accounting firms (two 
from each extreme of the "structure" continuum as defined by previous studies) responded to one 
of two versions of a pretested case.1 Response materials solicited audit managers' judgments as
to minimum experience levels required for the performance and for the initial supervision/review
of a list of 19 judgment-oriented audit tasks for a hypothetical client. Respondents also completed
an exit survey. Finally, two raters independently generated "structure" ratings for each of the 19 
judgment-oriented audit tasks for each of the four firms. 

This study yields five principal findings. First, large auditing firms continue to differ in terms 
of the "structuredness" of their audit approaches. Second, structured firm managers assigned less 
experienced auditors than did unstructured firm managers to perform and to initially supervise/ 
review audit decision tasks. Third, differences in experience assignments are associated with 
differences in structure measures at the task level, providing evidence that task-level structure 
rather than a firm-level mediating variable is the operational construct. Fourth, structured and 
unstructured firm managers responded differently to environmental complexity. While unstruc- 
tured firm managers increased their reliance on relatively experienced auditors in response to the 
complexity manipulation, structured firm managers generally did not increase required minimum 
experience levels of regular audit team members, contrary to expectations. However, exit survey 
results indicate that structured firm managers responded to complexity by increasing their 
reliance on specialists. Finally, structured firm managers' staff assignments were less variable 


1 In accordance with agreements made with the participating firms, the firms cannot be specifically identified, and results
are presented only in aggregate. Two versions of the case were used to manipulate environmental complexity. 
(between subjects) than those of unstructured firm managers. Variance of task-level experience
assignment is inversely related to task-level audit structure, again providing evidence that results 
are not driven by a firm-level mediating variable. 

The next section of the paper provides definitions of audit structure and environmental 
complexity. Section III reviews relevant literature and formulates testable hypotheses. Section IV 
provides an overview of the research design, development of the test instrument, and generation 
of the structure ratings. Section V presents results and discussion relating to statistical tests. 
Section VI concludes the paper and provides direction for future research. 


II. DEFINITION OF CONSTRUCTS
Audit Structure 


Based on a systematic review of the written in-house audit guidance materials of 12 large 
auditing firms, Cushing and Loebbecke (CL) (1986) identified differences in the degree of 
“structure” found in the firms’ audit processes, as evidenced by the firms’ documentation. CL 
defined a "structured audit methodology” as: 

A systematic approach to auditing characterized by a prescribed, logical sequence 
of procedures, decisions, and documentation steps, and by a comprehensive and 
integrated set of audit policies and tools designed to assist the auditor in conducting the 
audit (CL 1986, 32). 


These policies and tools range from written guidance provided in audit manuals to formulas 
for determining materiality and sample size to preprinted checklists and computer packages 
designed to assist in sets of judgments such as audit program preparation. Audit structure has the 
effect of restricting selected data inputs, data combination approaches, and information outputs 
to specific subsets (Lewis et al. 1983; Fry and Slocum 1984). In summary, a “structured” audit 
approach can be seen as the prescribed implementation of comprehensive and integrated policies, 
procedures, and decision tools to facilitate the transformation of judgments and evidence into an 
audit opinion (Ulrich and Weiland 1980; Bamber and Snowball 1988). 


Environmental Complexity 


For purposes of this study, the environment in which an audit is conducted is defined as the 
set of client attributes that can affect the complexity of judgment-oriented audit tasks. A relatively 
complex audit environment is one that tends to increase the quantity or decrease the clarity of 
information which must be considered in the input, processing, and output phases of a judgment- 
oriented task, thus increasing the demands made on the decision maker’s cognitive capacity 
(Bonner 1994). For example, identification of critical audit areas for a client with managerial 
instability and accounting system problems would likely require consideration of a greater 
number of cues that are relatively difficult to identify and measure, processing of these cues in 
the presence of a lower level of clarity in the overall relation between input cues and output, and 
output which is likely to take a greater number of forms and which may be less subject to objective 
testing criteria. Factors that can affect the complexity of tasks such as materiality and risk 
assessment can be grouped into three general categories: nature of the client, the client’s 
accounting system, and environment in which the client operates (Bamber and Bylinski 1982). 
III. FORMULATION OF RESEARCH QUESTIONS


Considerable research effort has been directed at examining the effects of structured decision 
aids on the quality or consistency of judgments (e.g., Lewis et al. 1983; Jiambalvo and Waller 
1984; Butler 1985; Aldag and Power 1986; Sharda et al. 1988; Libby and Libby 1989; Ashton 
1990; McDaniel 1990; Mackay et al. 1992). A related line of research in the auditing literature 
has investigated the effects of relatively "structured" audit approaches used by some of the largest 
auditing firms on the judgments of these firms' auditors in different environments (e.g., Kinney 
1986; Bamber and Snowball 1988; Morris and Nichols 1988; Dirsmith and Haskins 1991). In 
general, these two streams of research provide evidence that structured decision approaches can 
increase judgment consistency and accuracy, given that the decision maker is familiar with the 
technologies employed (e.g., Morris and Nichols 1988; Libby and Libby 1989; McDaniel 1990). 
However, prior research has not provided direct empirical evidence on how the use of decision- 
maker resources is affected by structure and environment. 

While no empirical evidence has been available, both the professional and academic 
accounting literatures commonly cite “knowledge sharing" and the consequent ability to use 
auditors of lesser experience as advantages of structured approaches (e.g., Mullarkey 1984; 
Elliott and Kielich 1985; Ashton and Willingham 1988; Libby and Luft 1993). However, other 
authors have posited that as the environment becomes more complex, the "fit" between structure 
and environment is likely to decrease. Thus, the applicability and effects of structured approaches 
may be contingent on the degree of complexity present in the environment (Bamber and Bylinski 
1982; Bamber and Snowball 1988). The following subsections develop expectations relating to 
these issues. 


Structure and Auditor Experience 


The first standard of fieldwork (SAS No. 1, AICPA 1972) requires the assignment of auditing 
tasks to staff with an appropriate level of skill, experience, and training to perform the task. In 
meeting this standard, managers must take into account at least two important factors. First, the 
cognitive requirements of a judgment-oriented task can vary with the way the underlying task is 
structured, the amount and type of guidance available or prescribed to the auditor performing the 
task, and the environment in which the task is performed (Libby and Luft 1993; Bonner 1994). 
Second, experienced auditors generally have more complete and more refined knowledge bases 
with which to perform judgment-oriented audit tasks than do inexperienced auditors (e.g., Waller 
and Felix 1984; Libby 1985; Bonner 1990; Libby and Frederick 1990; Libby and Luft 1993). 
Thus, the nature of the task, the way the task is structured, the environment, and the amount and 
type of guidance available and prescribed can affect resource allocation decisions as the audit 
manager attempts to match task demands with decision-maker experience and ability (Jiambalvo 
and Pratt 1982; Abdolmohammadi and Wright 1987; Abdolmohammadi 1987). In view of the 
steeply scaled pay and billing rates in major auditing firms, if structure allows for favorable 
tradeoffs in the experience level of the auditors required to adequately perform particular tasks, 
such tradeoffs could represent an important aspect of human resource efficiency (Libby and 
Frederick 1990; Gist 1994). 

Structured approaches can reduce the need for auditor experience through “knowledge- 
sharing," through the formalization and standardization of common tasks, and through the 
removal of strategic-level judgments from the discretion of the field auditor. In terms of 
knowledge-sharing, structured decision technologies can be seen as a medium through which the 
collective experience of an accounting firm is conveyed to individual auditors—in essence 
pushing the expertise of relatively experienced auditors to lower organizational levels (e.g.,
Elliott and Kielich 1985; Abdolmohammadi 1987; Elliott and Jacobson 1987; Messier and
Hansen 1987; Libby and Luft 1993; Ashton and Willingham 1988).2 Further, once a judgment
approach has been formalized and standardized, the role of the decision-maker changes from
"judge" (selecting which inputs to use and deciding how to combine them) to "operator" (simply
operating the "machine"), thus reducing cognitive demands on the decision-maker (e.g.,
Lichtenstein et al. 1977; MacGregor et al. 1984). Finally, structured audit methodologies can
move strategic judgments away from the field auditor by prescribing outcomes, given a set of
predefined informational inputs and lower-level component judgments (Kinney 1986).
Consistent with this argument, CL (1986, 33) characterize structured audit methodologies as providing
"explicit guidelines for reaching specific conclusions based upon the nature of the evidence
relating to particular factors."

Prior research indicates that "experience effects" are likely to be more pronounced in tasks 
which require more judgment and expertise on the part of the auditor (e.g., Abdolmohammadi 
1987; Abdolmohammadi and Wright 1987; Messier and Hansen 1987). Given this observation, 
a rational response on the part of the audit manager is to allocate less experienced auditors to more
structured tasks where independent judgment and expertise matter less, and more experienced 
auditors to less structured tasks where the judgment of the individual auditor is more critical. 
Extending this argument, if similar underlying tasks are more structured in one set of firms than 
in another, we might expect to observe predictable differences in required minimum experience 
levels across structure category. The following hypothesis formalizes this expectation: 


H1a: $PEL_{s} < PEL_{u}$, where $PEL_{s}$ and $PEL_{u}$ are the average experience levels assigned for the
performance of audit tasks in structured and unstructured firms, respectively.


Structured decision approaches can also be expected to allow for less experienced personnel 
to supervise and review the judgment-oriented tasks performed by junior auditors. Such
approaches may improve both horizontal and vertical communication by establishing a common 
terminology and a common basis for understanding (e.g., Williamson 1975; Mullarkey 1984; CL 
1986; Elliott and Jacobson 1987; Williams and Dirsmith 1988). Further, structured approaches 
will likely cause subordinates' judgment processes and outputs to be formulated and presented 
more consistently across time and across auditors, making them more easily understood by the 
reviewer, who will also be familiar with the technology used to formulate the judgment outcomes. 
Finally, since structured approaches often involve the prescribed combination of prespecified 
inputs, the supervision/review function in structured firms will likely be more oriented toward 
ensuring compliance with the firm's prescribed approach and investigating deviations therefrom 
than in unstructured firms, in which the focus will likely be on judging the reasonableness of 
subordinates' work in terms of information inputs selected, component judgments made, and 
combination approach used to formulate the judgment outcome and derivative conclusions. 
These expectations are formalized in the following hypothesis: 


H1b: $SEL_{s} < SEL_{u}$, where $SEL_{s}$ and $SEL_{u}$ are the average experience levels assigned for the
initial supervision/review of audit tasks in structured and unstructured firms,
respectively.


2 The term knowledge sharing, as used in this study, simply refers to the role of structured decision technologies in
allowing relatively high-level knowledge to be applied by personnel of relatively little experience. Whether this happens
as a result of the actual transfer of knowledge to the user of the technology, or whether the knowledge is simply embedded
in the technology which is mechanistically applied by the user, likely depends on the particular technology as well as
on the setting in which it is applied. Though this distinction is not of central importance in this study, further research
in this area might prove interesting and productive.
Prior studies have typically operationalized audit structure only at the firm level. Such an
approach cannot rule out the possibility that observed effects are due to exogenous firm-level
variables (e.g., differences in characteristics of human resource pools, culture, client portfolio,
etc.) that are confounded with firm-level structure measures. If predicted differences in experience
requirements are attributable to structure differences and not to a firm-level mediating
variable, then where task-level structure differences are large (small), differences in required 
experience levels should also be large (small) in the predicted direction. Thus, to provide 
assurance that results are not driven by an exogenous firm-level mediating variable, tests of H1a 
and H1b will include analysis of the relationship between task-level structure measures and 
staffing assignments. 


Interactive Effects of Structure and Environment 


Structured decision technologies mediate the execution of a task in a given environment by 
formalizing and standardizing the processes of environmental cue selection, cue combination, 
and information output formulation (Libby and Luft 1993; Bonner 1994). If a technology 
adequately takes into account all critical cues in the environment, a match exists between the 
technology and the demands of the task. Preformulated approaches can be designed to accommodate
a degree of complexity in the environment, but greater complexity diminishes the likelihood
of an adequate "fit" between a preformulated approach and task requirements (Ballew 1982; 
Bamber and Bylinski 1982; Sullivan 1984; CL 1986; Bamber and Snowball 1988). Thus, if 
important environmental cues are present which are not adequately accounted for by a preformu- 
lated approach, auditors are forced to improvise (see Jiambalvo and Pratt 1982), requiring a higher 
level of judgment to adequately perform the task than was previously necessary. 

Structured firm managers are expected to assign less experienced auditors to tasks through 
reliance on preformulated, structured approaches (see H1). However, as environmental complex- 
ity increases, these managers can be expected to increase the level of experience brought to bear 
on a given task to a greater degree than will unstructured firm managers because the preformulated
approaches are less likely to be effective in such environments. The following hypothesis 
summarizes the expectation that the experience differences predicted in H1 will be moderated 
with increased environmental complexity: 


H2a: $PEL_{sh} - PEL_{sl} > PEL_{uh} - PEL_{ul}$, where $PEL_{sh}$ and $PEL_{sl}$ are the average experience
levels assigned for the performance of audit tasks in structured firms in higher and
lower complexity environments, respectively, and $PEL_{uh}$ and $PEL_{ul}$ are the average
experience levels assigned for the performance of audit tasks in unstructured firms in
higher and lower complexity environments.


H2b: $SEL_{sh} - SEL_{sl} > SEL_{uh} - SEL_{ul}$, where $SEL_{sh}$ and $SEL_{sl}$ are the average experience
levels assigned for the initial supervision/review of audit tasks in structured firms in
higher and lower complexity environments, respectively, and $SEL_{uh}$ and $SEL_{ul}$ are the
average experience levels assigned for the initial supervision/review of audit tasks in
unstructured firms in higher and lower complexity environments.
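
The inequality in H2a and H2b is a difference-in-differences contrast on cell means. The following is a minimal sketch of that contrast only; the function name, the cell labels, and the response values are hypothetical illustrations and are not taken from the study's data or analysis.

```python
from statistics import mean


def interaction_contrast(cells):
    """Return (structured high - low) minus (unstructured high - low).

    `cells` maps (structure, complexity) to a list of experience-level
    responses. A positive value is in the direction H2a/H2b predict.
    """
    structured_shift = (mean(cells[("structured", "high")])
                        - mean(cells[("structured", "low")]))
    unstructured_shift = (mean(cells[("unstructured", "high")])
                          - mean(cells[("unstructured", "low")]))
    return structured_shift - unstructured_shift


# Hypothetical responses for one task (illustration only).
example_cells = {
    ("structured", "high"): [4, 5],
    ("structured", "low"): [3, 3],
    ("unstructured", "high"): [5, 5],
    ("unstructured", "low"): [4.5, 5.5],
}
print(interaction_contrast(example_cells))
```

A positive contrast means the structured-firm assignments rose more with complexity than the unstructured-firm assignments did, which is the pattern the hypotheses describe.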


Audit Structure and Consistency in Human Resource Allocation 


Reduction in the "noise" variation of audit judgments has been offered as a primary 
explanation for the introduction and use of structured decision approaches (e.g., Mullarkey 1984;
Elliott and Jacobson 1987; Ashton and Willingham 1988), and some evidence exists that 
structured approaches can decrease judgment variability (e.g., Morris and Nichols 1988; Libby 
and Libby 1989; McDaniel 1990; Dirsmith and Haskins 1991; Dilla and Stone 1994). Along these
lines, the matching of auditor experience levels with audit tasks in a given environment can be 
considered an important planning judgment which might itself be susceptible to the effects of 
structured approaches.
While structured firms do not specifically prescribe the experience level required to perform 
or supervise/review a particular audit task, they do provide guidance regarding staffing respon- 
sibilities for each audit engagement (CL 1986). In addition, audit structure may facilitate 
consistent task assignments by providing a common language and by invoking inherently more 
explicit, and thus more consistent, delineations of the steps involved in performing and 
supervising/reviewing particular tasks (Williamson 1975; Bamber et al. 1989). This study 
therefore predicts that firms using relatively structured audit methodologies will show greater 
within-firm consistency in human resource allocations for both performance and initial supervi- 
sion/review of audit tasks. 


H3a: $PV_{s} < PV_{u}$, where $PV_{s}$ and $PV_{u}$ are the between-subject variances in experience level
assignments for the performance of audit tasks in structured firms and unstructured
firms, respectively.


H3b: $SV_{s} < SV_{u}$, where $SV_{s}$ and $SV_{u}$ are the between-subject variances in experience level
assignments for the initial supervision/review of audit tasks in structured firms and
unstructured firms, respectively.


To provide assurance that results are not driven by an exogenous firm-level mediating 
variable, tests of these hypotheses will include analysis of the relationship between structure 
measures and variance at the task level. 


IV. RESEARCH METHOD 
Overview 


This study employs a two-by-two, between-subjects experiment to test the propositions 
described above. The dependent variables are task-level staffing assignment for the performance 
and for the initial supervision/review of each of 19 judgment-oriented audit tasks. A dichotomous 
firm-level measure of audit structure and a two-level manipulation of environmental complexity 
serve as independent variables. The analysis also uses task-level structure measures to rule out 
possible firm-level alternative explanations. 


Test Instrument, Procedures, and Subjects


The test instrument includes an audit case together with instructions and an exit survey to 
provide data for manipulation and bias checks. To ensure the understandability, realism, and 
adequacy of the materials, a manager and a partner from each of the firms participated in pilot 
tests. Case instructions emphasized the importance of the realism of the responses and of not 
discussing the case until after completing and returning the test instrument. In addition, 
instructions indicated that the purpose of the research was to find out how auditors at different 
experience levels are employed in practice and that there were no right or wrong answers. 
Respondents were not aware that other firms were participating. Finally, the respondents were 
asked not to allow any short-term shortages of personnel at their local offices to influence their 
staffing assignments.
Subjects are audit managers from four firms, two from each end of the audit structure
continuum as defined by previous studies.3 Contact partners within each of the firms distributed
the case materials to participants.4


Dependent Variables 


Using the CL (1986) generic model of a GAAS audit, 19 judgment-oriented audit tasks were 
identified (see figure 1).5 A response sheet solicited the minimum rank level required to
effectively perform and to initially supervise/review each of the 19 tasks in the context of the 
hypothetical client. Interviews conducted during pretesting indicated that audit managers 
typically conceptualize staffing assignments in terms of auditor rank level rather than years of 
experience. Thus, to maintain task realism, subjects indicated the experience of the auditor 
assigned to perform and to initially supervise/review each task in terms of the following
rank-level breakdown, where subscripts indicate time-in-grade: Staff_{0-1}, Staff_{1-2}, Staff_{>2},
Senior_{0-1}, Senior_{1-2}, Senior_{>2}, Manager_{0-1}, Manager_{1-2}, Manager_{>2},
Senior Manager_{0-1}, Senior Manager_{1-2}, Senior Manager_{>2}, Partner (subscript illegible),
and Partner.

While the experiment solicited rank-level responses to maintain task realism, systematic 
time-to-promotion differences among the firms might make rank-level responses unreliable 
indicators of tenure-based experience. Thus, the exit survey solicited information on the number 
of months responding managers had spent at each prior rank level so that rank-level responses 
could be transformed into the average years' experience of auditors assigned to the tasks. Rank- 
level responses were individually transformed within firms to a measure of average years of 
experience through a simple process of interpolation.6 The results section of this paper discusses
analyses using both transformed and untransformed rank-level responses. 
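
The rank-to-years transformation outlined in footnote 6 can be sketched roughly as follows. This is one possible reading and not the author's actual procedure: the function name, the treatment of the open-ended ">2" band (coded halfway between two years in grade and the firm's average promotion point), and the example month counts are all assumptions made for illustration.

```python
def experience_codes(avg_months_below_manager):
    """Map rank subcategories to approximate years of experience.

    `avg_months_below_manager` is an ordered list of (rank, average months
    spent at that rank) for one firm, covering ranks below manager.
    """
    codes = {}
    start = 0.0  # years of experience on entering the current rank
    for rank, months in avg_months_below_manager:
        duration = months / 12.0
        codes[f"{rank} 0-1"] = start + 0.5   # midpoint of first year in grade
        codes[f"{rank} 1-2"] = start + 1.5   # midpoint of second year in grade
        # Assumed midpoint of the open-ended band (not specified in the source).
        codes[f"{rank} >2"] = start + (2.0 + max(duration, 2.0)) / 2.0
        start += duration
    # From the manager level on, one year is added per subcategory,
    # i.e. no promotion differences are assumed beyond manager.
    later = ["Manager 0-1", "Manager 1-2", "Manager >2",
             "Senior Manager 0-1", "Senior Manager 1-2", "Senior Manager >2"]
    for i, name in enumerate(later):
        codes[name] = start + 0.5 + i
    return codes


# Hypothetical firm: 30 months as staff, 36 months as senior.
codes = experience_codes([("Staff", 30), ("Senior", 36)])
print(codes["Staff 0-1"], codes["Senior 0-1"], codes["Manager 1-2"])
```

Because the codes are computed per firm, the same rank-level response can map to different years of experience in firms with different promotion timing, which is the point of the transformation.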


3 Large auditing firms offer a natural setting for examining the effects of the use of structured decision approaches on 
human resource allocation. Due to the existence of professional standards which establish a common conceptual 
framework, on a general level all four firms studied here have conceptually compatible approaches to the conduct of 
a GAAS audit. Auditors across all four firms are required to perform similar tasks and to make conceptually similar 
judgments for a given client (e.g., materiality assessment, sample-size judgments, etc.), but decision guidance varies 
considerably across the firms. 

4 While the nature of the experiment did not allow direct researcher control over distribution of the test instrument,
measures were taken to encourage random distribution of the cases. An addressed, stamped return envelope was
provided so that responding managers could ensure that their responses were not seen by peers or superiors. The packet
sent to contact partners contained unlabeled copies of both versions of the case arranged in alternating order, and an
instruction sheet reiterating previous verbal instructions on case distribution. This process appears to have been effective
in mitigating possible distribution bias. Post hoc checks indicate no differences between complexity conditions in terms
of years of respondent experience, time-in-grade, years at current office, or specific experience with manufacturing
clients.

5 While the term task is used to describe the 19 judgment areas used in this paper, it should be noted that many of the tasks
include several component judgments, which could in themselves be considered "audit tasks" at a different level of
abstraction. Further, while more than one audit team member might be involved in several of the tasks examined here,
results of this paper relate to experience levels of team members primarily responsible for performing and initially
supervising/reviewing them.

6 The transformation was accomplished as follows. Average time to each rank level was computed individually for each
firm using responses from the exit survey. Responses indicating use of auditors in the first rank subcategory
(auditors with zero to one year's experience) were coded .5 for all firms. This code corresponds to
the midpoint of that subcategory, indicating that auditors in that category have an average experience of half
a year. Data on individual firms' average number of months spent at each rank level were then used to code the remaining
rank subcategories at their midpoints through interpolation. This process was followed up to the manager level, after
which one year was simply added to the manager-level number for each rank subcategory thereafter, implicitly assuming
no promotion differences beyond the manager level.


FIGURE 1
Conceptual Break-Down of the Audit Process into 19 Judgment-Oriented Audit Tasks 
(Adapted from Cushing and Loebbecke 1986) 


PRE-ENGAGEMENT & PLANNING ACTIVITIES

Obtain Knowledge of Business (OKB)
Perform Preliminary Analytical Review (PPA)
Make Preliminary Risk Assessment (MPR)
Make Preliminary Materiality Assessment (MPM)
Design Substantive Analytical Review Procedures (DSA)
Design Substantive Tests of Details (DST)
Determine Substantive Sample Approaches, Sizes (DSS)
Prepare Audit Programs (PAP)
Prepare Staffing & Time Budgets (PST)


INTERNAL CONTROL TESTING & ASSESSMENT ACTIVITIES

Assess Client's Control Environment (ACC)
Review System of Internal Controls (RSI)
Identify and Document Critical Audit Areas (IDC)
Review Audit Plan for Needed Modifications (RAP)


SUBSTANTIVE TESTING ACTIVITIES

Aggregate Results of Substantive Tests (ARS)
Evaluate Results of Substantive Tests (ERS)


OPINION FORMULATION & REPORTING ACTIVITIES

Make Final Review of Financial Statements (MFR)
Aggregate & Evaluate Audit Results (AEA)
Decide on Appropriate Audit Opinion (DAA)
Draft Audit Report (DAR)
Independent Variables
Environmental Complexity 


The case includes narrative information and financial statements covering the last three 
years. The hypothetical client, Ocean Manufacturing Inc., is a generic, medium-sized manufac- 
turer of household electrical appliances with approximately $110 million in sales and $70 million 
in total assets. To manipulate the environmental complexity variable, the following narrative 
elements were added in the higher complexity case: (1) Ocean Manufacturing’s production 
capacity had been expanded significantly over the last two years; (2) the management situation 
was “unsettled” due to some significant recent management turnover and the hiring of an 
inexperienced and “confused” new controller; (3) the company used a “complicated form of 
process costing;” and (4) the company had recently undergone a difficult transition to a new 
computerized accounting system, with problems remaining in inventory tracking, cost accumu- 
lation, receivables aging and billing, payroll deductions, payables balances, and balance sheet 
account classifications. The higher complexity case also indicates that audit trails were not 
maintained during several one-week periods during the year due to complications associated with 
the implementation of the new system. Apart from these changes in the narrative description of 
the client and its environment, the materials for the two cases are identical. 


Structure Ratings 


Rather than rely on previous measures of structure, one phase of this study included the 
development of an independent set of structure ratings. This work was undertaken because the
most recent structure ratings were at least six years old at the commencement of this study, and 
researchers are questioning the continuing validity of audit structure as a viable research construct 
in view of apparent continued change in the degree of structuredness by major auditing firms 
(Kinney 1986; Dirsmith and Haskins 1991). Further, no task-level measures of audit structure 
were available. 

Task-level ratings are important because, ex ante, the relative degree of structuredness across 
firms is not necessarily consistent across all parts of the audit, making firm-level measures 
potentially unreliable for studies examining task-level constructs. In addition, task-level structure 
measures allow for analysis to address the concern that observed differences in the dependent 
variable may result from a firm-level mediating variable rather than from task-level structure 
differences. 

Task-level structure measures were obtained through the following steps: (1) audit guidance 
materials (including audit manuals, printed field aids, and descriptions of electronic tools) were 
obtained and were reviewed with a manager or partner from each of the participating firms to 
ensure the completeness of the materials; (2) the materials were catalogued by the researcher 
according to a breakdown of the audit into 19 major audit judgment task areas, closely following 
the generic model developed by CL (1986); and (3) the materials were rated independently by two
raters other than the researcher for each of the 19 tasks. The raters independently analyzed the
materials by applying a simple algorithmic process very similar to that described by CL (1986).7
Output from this process consists of structure ratings on a scale from one to five for each of the 
19 audit tasks for each firm. 
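
The rating rules quoted in footnote 7 amount to a short decision procedure. A minimal sketch follows, assuming the rater's judgments are supplied as simple inputs; the function name, argument names, and length labels are hypothetical, and combinations the instructions do not address (for example, moderate-length narrative guidance judged analytical) are resolved here by an arbitrary assumption.

```python
def structure_score(length, nonnarrative=False, analytical=False, integrated=False):
    """Score one task's guidance materials on the one-to-five scale.

    length: "none", "short", "moderate", or "extensive" (rater's judgment).
    nonnarrative: materials include nonnarrative content with analytical guidance.
    analytical: guidance as a whole judged relatively analytical.
    integrated: materials judged highly integrated with other tasks' materials.
    """
    if length in ("none", "short"):
        return 1  # not dealt with, or only a short narrative passage
    if length == "moderate" and not nonnarrative:
        return 2  # moderate-length narrative (assumed to cover unlisted cases)
    if not analytical:
        return 3  # relatively qualitative guidance
    return 5 if integrated else 4  # analytical; high vs. low integration


print(structure_score("extensive", analytical=True, integrated=True))
```

Each input is itself a subjective call by the rater, which is why the study reports interrater agreement on the resulting scores.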


7 The following instructions were given to the raters prior to beginning the rating process: “Assign a score to each set of 
audit guidance materials relating to each of the 19 audit tasks listed, based on the following rules. If the task is not dealt
with in the firm's materials, or is dealt with in only a short passage in narrative form, assign a score of one. If the task 
is dealt with in a section of moderate length in narrative form and with primarily qualitative guidance, assign a score 

(Continued) 
V. STATISTICAL TESTS AND RESULTS


Structure Ratings Results


The ratings provided by the two raters show a high degree of association given the subjective 
nature of the rating task (see table 1). Spearman rank correlations were computed on the raters’ 
rankings for each of the 19 sets of ratings across the four firms (n = 4 for each correlation 
computed). Spearman correlation is an appropriate indicator of whether raters are using similar 
evaluation criteria and whether they have similar perceptions of the objects to be rated in terms 
of the evaluation criteria (see Gibbons 1985, 278). As indicated in table 1, the raters' rankings are 
correlated at .90 or above for 10 of the 19 tasks, and the average rank correlation across the 19 
sets of ratings is .75.8

For 16 of the 19 sets of averaged ratings, the same two firms are indicated as structured firms, 
and average overall measures from the two raters indicate the same structured/unstructured 
dichotomy at the firm level. Differences in average firm-level ratings within the structured and 
unstructured categories are small (.06 and .05, respectively), while the difference in average 
overall ratings between structure categories is relatively large (1.48), indicating systematic and 
pervasive differences in the degree to which structured approaches are used in the firms studied 
here. These results confirm prior structure categorizations and indicate that the participating firms 
are quite cleanly dichotomized at the firm level.9


Manipulation and Bias Checks 


Of 740 cases distributed to audit managers in various offices from the four participating firms, 
355 were returned for an overall response rate of 48 percent.10 Average experience of the
respondents was just over eight years (SD = 2.2). Difference in respondent experience across 
structure category was not significant (8.1 and 8.3 years for structured and unstructured firms 


Footnote 7 continued 

of two. If the task is dealt with in a section of moderate length which includes nonnarrative types of material and analytical
guidance, or is dealt with in a section of extensive length, then judge the degree to which the guidance as a whole relating to
that particular task is analytical in nature. If the guidance is judged to be relatively qualitative in nature, assign a score of three.
If the guidance is judged to be relatively analytical in nature, then assess the degree of integration of the materials relating
to that task with materials relating to other tasks. If the degree of integration is judged to be low, assign a score of four; if high,
assign a score of five. Qualitative guidance is guidance which concentrates primarily on identifying those factors that should
be considered by the auditor in forming a particular judgment. Analytical guidance is guidance that also provides explicit
guidelines for reaching specific conclusions based upon the nature of the evidence relating to particular factors” (adapted from
CL 1986, 33-34). The raters were also provided a simple flow-chart diagram of the above narration.

8 The canonical correlation between the two raters' sets of multivariate ratings (for the 19 tasks and four firms) was also
computed. The two multivariate sets of ratings are highly correlated (R = .89).

9 Responses to the exit survey provide further evidence in support of the firm-level structure dichotomy used in this study.
Consistent with the definition of audit structure as the prescribed use of standardized, formalized audit technologies,
auditors in structured firms perceived a stronger expectation to adhere to their firms’ audit approaches than did managers
from unstructured firms (9.0 and 8.6 respectively on ten-point scale, t=2.7, p<.01). Relatedly, structured firm managers
indicated a higher level of agreement with the statement "My firm strongly encourages use of the firm's practice aids,
decision tools, etc., in almost every audit" (1.9 and 1.5 respectively on -3 to +3 scale, t=2.85, p<.01). Also consistent
with expectations, structured firm managers’ in-house audit materials played a more important role in influencing their
responses to the case than did the in-house audit materials of unstructured firm managers (7.4 and 6.4 respectively on
ten-point scale, t=4.2, p<.001). Finally, consistent with the notion that structured approaches increase the consistency
of procedures applied across audits, and with findings from prior studies (e.g., Bamber and Snowball 1988), structured
firm managers agreed more with the statement "In my experience, one audit seems very similar to the next in terms of
tasks and procedures performed" (-.06 and -.4 respectively on -3 to +3 scale, t=1.8, p<.04).

10 Structured firm managers responded at a rate of 49 percent, compared to 47 percent for unstructured firm managers.
Forty-six percent of the higher complexity cases were returned, compared to 50 percent of the lower complexity cases.


TABLE 1
Task-Level Structure Ratings*
(Maximum = 5; Minimum = 1)

        Average Rating for        Average Rating for       Diff in    Interrater Reliability
        Unstructured Firms        Structured Firms         Average    (Spearman Correlations
Audit   Rater  Rater  Overall     Rater  Rater  Overall    Ratings    Between Raters)**
Task      1      2    Average       1      2    Average    (S-U)
OKB      2.5    1.5     2.0        4.0    3.5     3.8        1.8        0.95
PPA      1.5    1.0     1.3        4.0    3.5     3.8        2.5        0.74
MPR      2.0    1.5     1.8        5.0    5.0     5.0        3.3        0.94
MPM      1.0    1.0     1.0        5.0    5.0     5.0        4.0        1.00
ACC      3.5    2.3     2.9        5.0    5.0     5.0        2.1        1.00
RSI      3.3    3.0     3.1        4.3    5.0     4.6        1.5        0.95
IDC      2.0    1.5     1.8        2.5    1.5     2.0        0.3        0.58
DSA      2.0    1.5     1.8        4.0    2.3     3.1        1.4        0.94
DST      4.0    4.3     4.1        4.5    4.8     4.7        0.5        0.83
PAP      4.3    4.8     4.5        4.5    4.3     4.4       -0.1        0.50
DSS      3.0    3.5     3.3        5.0    5.0     5.0        1.8        0.94
PST      3.0    1.8     2.4        3.0    1.8     2.4        0.0        0.00
ARS      1.5    2.5     2.0        3.0    4.5     3.8        1.8        0.74
ERS      2.0    2.5     2.3        3.5    4.8     4.1        1.9        0.74
RAP      1.5    1.0     1.3        3.0    1.0     2.0        0.8        0.94
MFR      1.5    5.0     3.3        3.0    5.0     4.0        0.8        0.00
AEA      1.3    1.5     1.4        3.3    3.5     3.4        2.0        1.00
DAA      2.0    1.8     1.9        4.0    3.5     3.8        1.9        0.90
DAR      4.5    3.0     3.8        4.6    3.1     3.9        0.1        0.58
AVG      2.4    2.4     2.4        4.0    3.9     3.9        1.5        0.75

Key to Task Abbreviations:

OKB = Obtain Knowledge of Business
PPA = Perform Preliminary Analytical Review
MPR = Make Preliminary Risk Assessment
MPM = Make Preliminary Materiality Assessment
ACC = Assess Client's Control Environment
RSI = Review System of Internal Controls
IDC = Identify and Document Critical Audit Areas
DSA = Design Substantive Analytical Review Procedures
DST = Design Substantive Tests of Details
PAP = Prepare Audit Programs
DSS = Determine Substantive Sample Approaches, Sizes
PST = Prepare Staffing & Time Budgets
ARS = Aggregate Results of Substantive Tests
ERS = Evaluate Results of Substantive Tests
RAP = Review Audit Plan for Needed Modifications
MFR = Make Final Review of Financial Statements
AEA = Aggregate & Evaluate Audit Results
DAA = Decide on Appropriate Audit Opinion
DAR = Draft Audit Report

* See footnote 7 for a description of the ratings.
** n = 4 for each correlation (four firms rated for each task).
respectively, t=1.08, p=.28), nor was the difference across case version (8.1 and 8.2 years for
higher and lower complexity conditions respectively, t=.44, p=.66). No significant differences 
were found between structure categories in terms of the number of times the respondents had been 
involved with audit clients of similar size and risk as the one described in the case (9.4 and 8.8 
times for structured and unstructured firms respectively, t=.43, p>.65). 

Respondents indicated a fairly high level of confidence in the representativeness of their 
responses to a “real-world” audit (mean=7.4, median=8, SD=1.6, with 1 indicating “not 
representative,” and 10 indicating “very representative”), and indicated that the information 
provided in the case was adequate for the purposes of completing the required tasks (mean=6.5, 
median=7, SD = 1.8, with a 1 indicating “not at all adequate,” and a 10 indicating “very 
adequate”). Responses to these two questions were not significantly different between structure 
categories, evidence that the case materials did not induce an a priori bias in terms of the responses 
solicited (7.5 and 7.3 for structured and unstructured firms respectively, t=1.1, p=.27, and 6.5 and 
6.4 for structured and unstructured firms respectively, t=.71, p=.48). Finally, higher complexity 
case respondents indicated that they considered the audit scenario in the case to be more complex 
relative to most ordinary audits than did the lower complexity case respondents (5.8 and 4.3 
respectively, t=9.1, p<.001), indicating a statistically significant difference in perceived com- 
plexity across environment conditions. 


Structure and Required Experience Levels (H1) 


To assess the effects of structure on experience level for the performance and supervision/ 
review of judgment-oriented audit tasks, data were analyzed using a Multivariate Analysis of 
Variance (MANOVA), with firm-level structure and environment as dichotomous independent 
variables, and experience levels for the performance and initial supervision/review of the 19 audit 
judgment tasks as dependent variables.11, 12


11 MANOVA provides an appropriate overall measure of significance in the presence of multiple dependent measures that
are likely to be intercorrelated (Johnson and Wichern 1992; Rencher 1995). In addition, when the MANOVA test of
significance for a given independent variable is significant, the individual significance levels of the related univariate
analyses of variance (ANOVAs) can be relied on without inflating the overall alpha level of the family of multiple tests
(Rencher and Scott 1990). MANOVA requires omission of observation sets for which the response on one or more of
the dependent variables is missing. Of 355 response sheets returned, 273 contained observation sets with no missing
values for performance assignments across the 19 dependent variables, and 299 contained complete observation sets
for supervision/review assignments. The data points omitted from the multivariate analysis are distributed nearly
equally across experimental cells. Univariate analyses of the incomplete observation sets indicate that the omitted
observations are not qualitatively different from those included in the multivariate analysis.

12 As indicated previously, three of the 19 task-level structure rating sets (IDC, PAP, PST) did not produce the same
structure dichotomy as that indicated by the other 16 sets of ratings. Since MANOVA requires a single definition of each
independent variable, firm-level structure ratings were of necessity used. The MANOVA model was estimated both
with and without these three dependent variables. The results were qualitatively identical. Reported univariate results
are based on appropriate task-level structure dichotomies. Univariate ANOVAs were repeated using the experience
level of the respondent as a covariate. The covariate, while in some cases individually significant, had no qualitative
effect on the results reported.


Structure and Experience levels for Performance of Judgment Tasks (H1a)


H1a predicts that on average structured firm managers will assign lower experience levels
than unstructured firm managers for the performance of judgment-oriented audit tasks. Table 2
shows results from the MANOVA model with required experience level for the performance of
each of the 19 tasks as dependent variables and firm-level structure and environmental
complexity as independent variables. These results indicate that the effect of structure on the dependent
variables is highly significant (p<.0001).13


TABLE 2

MANOVA: Effects of Structure and Environment on Auditor Experience Assignments
for the Performance of 19 Judgment-Oriented Tasks

MANOVA Model: $Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$
where: $Y_{ijk}$ is a vector of responses in 19 dependent variables, $\alpha_i$ is the effect of the
ith level of A (structure) on each of the 19 variables in $Y_{ijk}$, $\beta_j$ is the effect of the
jth level of B (environment), and $\gamma_{ij}$ is the AB interaction effect.

               Test Statistic    Numerator     Denominator                F-Test
               Value (Wilks'     Degrees of    Degrees of                 Probability
Variable       Lambda)           Freedom       Freedom        F-Ratio     Level (Alpha)
Structure      0.71              19            251            5.33        0.0000
Complexity     0.92              19            251            1.16        0.2943
Interaction    0.90              19            251            1.51        0.0833
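
As a rough consistency check on table 2: when an effect has one hypothesis degree of freedom, as each effect in a two-by-two design does, Wilks' Lambda converts to an exact F-ratio by a standard formula. The sketch below applies that formula. The function name is hypothetical, and the error degrees of freedom (273 complete observation sets minus four cells) are inferred here from footnote 11 and the design rather than stated in the source.

```python
def exact_f_from_wilks(wilks_lambda, n_dependent, error_df):
    """Exact F for an effect with one hypothesis degree of freedom.

    Returns (F, numerator df, denominator df), where
    F = ((1 - L) / L) * (error_df - p + 1) / p.
    """
    den_df = error_df - n_dependent + 1
    f_ratio = (1.0 - wilks_lambda) / wilks_lambda * den_df / n_dependent
    return f_ratio, n_dependent, den_df


# Assumed error df: 273 complete observation sets less four design cells.
error_df = 273 - 4
for name, lam in [("Structure", 0.71), ("Complexity", 0.92), ("Interaction", 0.90)]:
    f_ratio, num_df, den_df = exact_f_from_wilks(lam, 19, error_df)
    print(name, round(f_ratio, 2), num_df, den_df)
```

Under these assumptions the denominator degrees of freedom come out at 251, matching the table, and the recomputed F-ratios agree with the reported values only approximately because Lambda is reported to two decimals.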

Nineteen univariate ANOVA models were subsequently estimated in order to determine the 
direction of the experience differences and to explore which of the dependent variables are likely 
to be most important to the overall significance of the multivariate tests (see Rencher and Scott 
1990; Johnson and Wichern 1992; Hintze 1992). As indicated in table 3, the structure main effect 
is significant at p < .01 in six of the 19 judgment areas listed, and at p < .10 in nine of the 19 areas 
(one-tailed). The nine tasks for which the structure variable is significant are MPR, MPM, ACC, 
RSI, IDC, PAP, DSS, PST, and AEA (see task abbreviation key in table 1). Strongly consistent 
with H1a, the mean required experience levels for 16 of the 19 dependent variables, and for all
nine of the dependent variables for which structure is significant, are lower for structured firms
than for unstructured firms. For these nine tasks, the average difference in the dependent variables
between structure categories is .46, corresponding to approximately a six-month mean experience
difference.14, 15


13 Since the MANOVAs reported here involve only two independent variables (structure and environmental complexity),
the significance levels reported are derived from exact F-ratios. Thus the four most commonly used statistics in
multivariate means analysis (Wilks' Lambda, Hotelling-Lawley trace, Pillai's trace, and Roy's largest root) yield
identical p-values. Accordingly, only Wilks' Lambda is reported.

14 MANOVAs were also estimated within each environmental condition to investigate whether interpretation of the
structure results is conditional on environment. While the structure effect is clearly larger in the higher complexity
condition (F = 3.19, p<.001; see discussion of structure/environment interaction below), the multivariate structure
effect in the lower complexity condition is also significant (F = 1.80, p=.04). Univariate ANOVAs indicate that the
structure variable is in the expected direction in 11 of the 19 dependent variables for the lower complexity condition,
and in three of four of the significant ANOVAs.

15 Similar analysis using experience-level responses adjusted for between-firm promotion differences (i.e., "transformed
responses") indicates that structure is significant in 14 of the 19 task areas. Differences between structure categories are
in the expected direction for all 14 significant variables. Unstructured firm experience levels are reliably higher in both
environments for nearly all of the 19 variables.
Task-level analysis. While this study’s structure ratings indicate that a firm-level structure
measure is likely to be reliable as an independent variable for examining firm-level dependent 
constructs, such an approach cannot be used to investigate the possibility that observed effects are 
due to exogenous firm-level variables that are confounded with firm-level structure measures. 
Accordingly, average task-level structure ratings for the “unstructured” firms were subtracted 
from the average task-level structure ratings for the "structured" firms, resulting in a vector of 
differences in task-level ratings. Experience responses for the performance of audit tasks were 
averaged within structure category and were then differenced across structure category for each 
of the 19 tasks, forming a vector of differences in task-level average required experience levels. 
The structure differences vector was then correlated with the experience differences vector. 

Results from the correlation analysis for performance responses indicate a significant
positive relationship between task-level structure difference and experience-requirement difference
(Spearman correlation = .56, n = 19, p = .01). In other words, large (small) differences in task-level
structure ratings are associated with large (small) differences in task-level experience
requirements in the expected direction. Consistent with H1a, these results provide evidence that
task-level audit structure is related to differences in required minimum experience level for the
performance of judgment-oriented audit tasks, indicating that observed differences across
structure category are not solely attributable to an exogenous, firm-level mediating variable.¹⁶


Structure and Experience levels for Initial Supervision/Review of Judgment Tasks (H1b) 


H1b predicts that, on average, structured firm managers will assign lower experience levels 
than will unstructured firm managers for the supervision/review of audit tasks. A MANOVA 
model was estimated with required minimum experience levels for the supervision/review of the 
19 judgment-oriented audit tasks as dependent variables, and firm-level structure category and 
environmental complexity as independent variables. The effect of structure on the multivariate 
means vector is highly significant (p < .0001; see table 4). In all 12 tasks for which the
(univariate) structure effect is significant (OKB, MPR, MPM, ACC, IDC, PAP, DSS, PST, ERS,
MFR, AEA, and DAR), structured firm managers indicated a lower mean rank level than did
unstructured firm managers for the initial supervision/review of that task (see table 5). For these
12 tasks, the average difference in the dependent variables between structure categories is .50,
corresponding to approximately a six-month mean experience difference.¹⁷

Task-level analysis. Similar to the task-level analysis for performance responses, average 
experience responses for the initial supervision/review of audit tasks were differenced across 
structure category for each of the 19 tasks, forming a vector of differences in task-level average 
required experience levels. This vector was then correlated with the vector of task-level structure 
differences described previously. Similar to the task-level results reported for performance 
staffing assignments, the difference in average supervision/review responses across firm category 


16 The task-level correlation analysis for performance experience levels was repeated using "transformed" experience-
level responses in place of rank-level responses in the dependent variable. The results of this analysis are qualitatively
identical to those reported in the paper (Spearman correlation coefficient = .61, p < .01).

17 The analysis for supervision/review was not repeated with the dependent measure adjusted for between-firm promotion
differences, as was done for task performance experience measures, for two reasons. First, most of the responses for
initial supervision/review center around the manager level or above. Promotion differences between firms are manifest
predominantly at the staff level. Additionally, data on promotion differences were only gathered up to the manager level,
and thus extrapolation of differences at low rank levels to higher levels would be questionable. Second, the structure
effect using unadjusted rank levels proved to be strongly in the predicted direction for the initial supervision/review of
the tasks. Transforming the rank-level responses to years-of-experience responses would likely strengthen the already
highly significant results, yielding little qualitative difference.


TABLE 3

Average Experience-Level Assignments* for the Performance of 19 Judgment-
Oriented Audit Tasks: Univariate ANOVA Results by Structure and Environment

        Structured Firms          Unstructured Firms        Structure       Str/Env.
Audit   Lower       Higher        Lower       Higher        Main Effect     Interaction
Task    Complexity  Complexity    Complexity  Complexity    t-Statistic     F-Statistic
                                                            (One-tailed)
OKB     5.06        5.04          4.89        5.62          1.04            3.64**
PPA     4.43        4.69          4.32        4.88          0.28            1.12
MPR     5.25        5.24          5.31        5.89          2.02**          2.93*
MPM     5.37        5.47          5.68        5.71          1.34*           0.03
ACC     4.36        4.51          4.71        5.34          3.66**          1.56
RSI     3.27        3.32          3.46        4.15          3.03**          3.52*
IDC     4.59        4.57          4.83        5.20          2.64**          1.34
DSA     4.51        4.67          4.20        4.77          0.81            2.43
DST     4.28        4.41          4.12        4.68          0.37            2.29
PAP     4.07        4.20          4.22        4.69          2.53**          1.89
DSS     3.72        4.04          4.37        4.74          3.90**          0.02
PST     4.49        4.50          4.90        5.31          3.79**          1.53
ARS     3.78        4.01          3.80        4.03          0.10            0.00
ERS     4.22        4.25          4.29        4.54          1.06            0.39
RAP     5.83        5.66          5.82        5.81          0.32            0.16
MFR     6.57        6.54          6.52        142           0.93            1.18
AEA     5.41        5.10          5.17        5.91          1.51*           7.78**
DAA     6.70        7.12          6.58        6.86          0.69            0.07
DAR     4.88        5.01          5.02        5.20          0.92            0.02

AVG.    4.78        4.86          4.86        5.29

(P-values of .05 or less are highlighted with double asterisks, p-values between .10 and .05 with a single asterisk. Significance
levels for t-statistics are one-tailed.)

Key to Task Abbreviations:

OKB = Obtain Knowledge of Business
PPA = Perform Preliminary Analytical Review
MPR = Make Preliminary Risk Assessment
MPM = Make Preliminary Materiality Assessment
ACC = Assess Client’s Control Environment
RSI = Review System of Internal Controls
IDC = Identify and Document Critical Audit Areas
DSA = Design Substantive Analytical Review Procedures
DST = Design Substantive Tests of Details
PAP = Prepare Audit Programs
DSS = Determine Substantive Sample Approaches, Sizes
PST = Prepare Staffing & Time Budgets
ARS = Aggregate Results of Substantive Tests
ERS = Evaluate Results of Substantive Tests
RAP = Review Audit Plan for Needed Modifications
MFR = Make Final Review of Financial Statements
AEA = Aggregate & Evaluate Audit Results
DAA = Decide on Appropriate Audit Opinion
DAR = Draft Audit Report

*Experience-level coding: 1-3 = Staff, 4-6 = Senior, 7-9 = Manager, 10-12 = Senior Manager, 13-14 = Partner,
where successive codes within a rank indicate time-in-grade in one-year increments.
TABLE 4

MANOVA: Effects of Structure and Environment on Auditor Experience Assignments
for the Initial Supervision/Review of 19 Judgment-Oriented Tasks

MANOVA Model: $y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$

where $y_{ijk}$ is a vector of responses in 19 dependent variables, $\alpha_i$ is the effect of A
(structure) on each of the 19 variables in $y_{ijk}$, $\beta_j$ is the effect of the jth level of B
(environment), and $\gamma_{ij}$ is the AB interaction effect.

              Test Statistic   Numerator     Denominator                F-Test
              Value (Wilks'    Degrees of    Degrees of                 Probability
Variable      Lambda)          Freedom       Freedom        F-Ratio     Level (Alpha)

Structure     0.78             19            277            4.18        0.0000
Complexity    0.91             19            277            1.49        0.0874
Interaction   0.92             19            277            1.35        0.1544


is positively correlated with the difference in task-level structure rating in the expected direction 
(Spearman correlation = .50, n = 19, p < .03). These results provide evidence that the observed 
effect predicted in H1b is attributable to task-level audit structure. 


Structure and Environment (H2) 
Structure-by-Environment Interaction for Performance of Judgment Tasks (H2a) 


H2a predicts that the experience differences predicted in H1a will be moderated as increased 
environmental complexity increases the likelihood of a diminished structure/environment fit. As 
indicated in table 2, the multivariate structure-by-environment interaction effect for the performance
of judgment-oriented audit tasks is statistically significant (F = 1.51, p = .08).
However, contrary to H2a, in 16 of the 19 univariate ANOVAs, unstructured firm managers 
increased the required experience level for performance of the given task by a greater degree than 
did structured firm managers as environmental complexity increased. Four of these 16 "contrary" 
univariate interactions are significant at the .10 level, indicating that these four variables may be 
influential in terms of the multivariate results (OKB, MPR, RSI, and AEA). The interaction term 
is not significant for any of the three variables for which results are not in the predominant 
direction. As seen in table 3, unstructured firms’ performance experience levels increase across 
nearly all tasks in the higher complexity environment, while structured firms’ experience levels 
change relatively little. To explore possible explanations for the interaction results, data from the 
exit survey were further analyzed. 
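
The direction of each univariate interaction can be read off the cell means as a difference-in-differences contrast: the change in the unstructured firms' mean from the lower to the higher complexity condition, minus the corresponding change for structured firms. The sketch below applies this to five tasks whose cell means are taken from table 3; the function name is illustrative, and the contrast describes direction only (it is not the F-test reported in the table).

```python
# Cell means from table 3, ordered as:
# (structured lower, structured higher, unstructured lower, unstructured higher)
TABLE3_CELL_MEANS = {
    "OKB": (5.06, 5.04, 4.89, 5.62),
    "MPR": (5.25, 5.24, 5.31, 5.89),
    "RSI": (3.27, 3.32, 3.46, 4.15),
    "MPM": (5.37, 5.47, 5.68, 5.71),
    "DAA": (6.70, 7.12, 6.58, 6.86),
}


def interaction_contrast(cells):
    """Unstructured change minus structured change, rounded to two decimals."""
    s_low, s_high, u_low, u_high = cells
    return round((u_high - u_low) - (s_high - s_low), 2)


contrasts = {task: interaction_contrast(c) for task, c in TABLE3_CELL_MEANS.items()}

for task, value in contrasts.items():
    direction = "unstructured rose more" if value > 0 else "structured rose more"
    print(f"{task}: {value:+.2f} ({direction})")
```

A positive contrast corresponds to the "contrary" pattern described above, in which unstructured firm managers raise required experience by more than structured firm managers as complexity increases.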

Additional analysis on performance interaction. A possible explanation for the apparent 
lack of response to the complexity manipulation on the part of structured firm managers is that 
reliance on structured approaches causes structured firm managers to be less sensitive to 
qualitative environmental conditions (e.g., see Dirsmith and McAllister 1982; CL 1986; Bamber 
et al. 1989; Dirsmith and Haskins 1991; Libby and Luft 1993). However, this explanation can be 
ruled out by responses to the exit survey. In the lower complexity condition, there was not a
significant difference between structured and unstructured firm managers' perceptions of the
complexity of the audit environment relative to most audits (4.3 and 4.4 respectively, t = .40).
However, in the higher complexity condition, structured firm managers perceived the environment
to be more complex than did unstructured firm managers (6.1 and 5.5 respectively, t = 2.4,
p < .02; interaction significant at p < .05). Structured firm managers were clearly not less sensitive
than unstructured firm managers in perceiving the environment manipulation.

While structured firm managers generally did not increase required experience levels for 
individual tasks, the exit survey provides evidence that structured firm managers responded to 
environmental complexity by increasing their reliance on specialists. In response to an exit survey 
question, structured firm managers indicated a significantly higher likelihood of using specialists 
in the higher complexity environment than in the lower complexity environment (6.4 and 5.6
respectively on a ten-point scale; t = 1.85, p = .07, two-tailed). Unstructured firm managers did not
indicate a different likelihood of using specialists in the higher complexity environment than in
the lower complexity environment (4.9 and 5.1 respectively, on a ten-point scale; t = .53, p = .60,
two-tailed). This response by structured firm managers can be seen as a move to enhance
experience levels in complex environments by bringing to bear the expertise of available 
specialists rather than by increasing the experience levels of regular audit team members (see 
Taylor 1995). Along these lines, results from other studies indicate that auditors from structured 
firms increase consultation with peers and superiors to a greater extent than do unstructured firm 
auditors as environmental uncertainty increases (e.g., Bamber and Snowball 1988). These results 
suggest potentially important differences in how structured and unstructured firm managers 
respond to complexity in the audit environment. 


Structure-by-Environment Interaction for Initial Supervision/Review of Judgment Tasks (H2b) 


With regard to H2b, the MANOVA results do not indicate the presence of a significant
structure-by-environment interaction for the initial supervision/review of the audit tasks (see
table 4; F(19, 277) = 1.35, p < .16). Table 5 shows a significant but fairly uniform increase in required
supervision/review experience levels between the lower and higher complexity environments for
both structured and unstructured firm managers (multivariate environment main effect significant
at p = .09, two-tailed; see table 4).¹⁸


Structure and Consistency in the Allocation of Human Resources (H3) 


It was predicted in H3a and H3b that structured firm managers would be more consistent in 
their allocation of human resources to various audit tasks than would unstructured firm managers 
due to more explicit staffing guidelines and more consistently defined task requirements. Exit 
survey responses provide indirect evidence supporting this premise: structured firm managers 
perceived that their staffing assignments were influenced more by their firms' audit guidance 
materials than did unstructured firm managers (7.4 and 6.4, respectively, on a 10-point scale; t =
4.04, p < .001).

To test H3a, the variance in task assignments for performance was computed across auditors 
within each of the two firm-level structure categories for each audit task. In 15 of the 19 tasks, 
unstructured firm managers' responses are more variable than those of structured firm managers. 
Variances are significantly different between groups for eight of the 19 dependent variables (PPA, 


18 Similar to the Analysis of Covariance performed for the performance responses, the univariate ANOVAs were rerun 
for supervision/review, using experience level of the respondent as a covariate. Again, the covariate had little effect on 
the structure or environment results. 
TABLE 5

Average Experience-Level Assignments* for the Initial Supervision/Review of
19 Judgment-Oriented Audit Tasks: Univariate ANOVA Results
by Structure and Environment

        Structured Firms          Unstructured Firms        Structure       Str/Env.
Audit   Lower       Higher        Lower       Higher        Main Effect     Interaction
Task    Complexity  Complexity    Complexity  Complexity    t-Statistic     F-Statistic
                                                            (One-tailed)
OKB     7.86        8.02          8.04        8.68          1.78**          1.04
PPA     7.41        7.78          7.53        7.66          0.17            0.25
MPR     8.40        8.22          8.95        9.13          2.86**          0.47
MPM     8.61        8.74          9.33        9.16          2.08**          0.28
ACC     7.46        7.59          8.04        8.28          3.20**          0.08
RSI     6.51        6.52          6.40        6.79          0.44            0.92
IDC     7.55        7.86          8.07        8.00          1.64**          0.86
DSA     7.50        7.62          7.49        7.66          0.10            0.02
DST     7.31        7.39          7.39        7.46          0.30            0.02
PAP     7.12        7.38          7.34        7.63          1.42*           0.01
DSS     6.92        7TH           7.63        7.68          3.06**          0.13
PST     7.60        7.78          8.22        8.47          3.03**          0.02
ARS     6.85        7.33          6.98        7.02          0.49            1.48
ERS     7.04        7.33          7.58        7.54          1.80**          0.59
RAP     9.31        9.16          9.76        8.69          0.00            2.41
MFR     10.09       9.82          10.52       10.99         2.61**          1.49
AEA     8.58        8.39          8.95        9.08          2.01**          0.35
DAA     10.46       10.83         10.27       10.83         0.30            0.09
DAR     8.11        8.58          8.69        8.67          1.35*           0.99

AVG     7.94        8.08          8.27        8.39

(P-values of .05 or less are highlighted with double asterisks, p-values between .10 and .05 with a single asterisk. Significance
levels for t-statistics are one-tailed.)

Key to Task Abbreviations:

OKB = Obtain Knowledge of Business
PPA = Perform Preliminary Analytical Review
MPR = Make Preliminary Risk Assessment
MPM = Make Preliminary Materiality Assessment
ACC = Assess Client’s Control Environment
RSI = Review System of Internal Controls
IDC = Identify and Document Critical Audit Areas
DSA = Design Substantive Analytical Review Procedures
DST = Design Substantive Tests of Details
PAP = Prepare Audit Programs
DSS = Determine Substantive Sample Approaches, Sizes
PST = Prepare Staffing & Time Budgets
ARS = Aggregate Results of Substantive Tests
ERS = Evaluate Results of Substantive Tests
RAP = Review Audit Plan for Needed Modifications
MFR = Make Final Review of Financial Statements
AEA = Aggregate & Evaluate Audit Results
DAA = Decide on Appropriate Audit Opinion
DAR = Draft Audit Report

*Experience-level coding: 1-3 = Staff, 4-6 = Senior, 7-9 = Manager, 10-12 = Senior Manager, 13-14 = Partner,
where successive codes within a rank indicate time-in-grade in one-year increments.
MPR, ACC, RSI, IDC, ERS, RAP, AEA). Consistent with H3a, in all eight of the tasks for which
the differences are significant, unstructured firm managers’ responses exhibit higher variance. 
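
The per-task variance comparison can be illustrated with a simple variance ratio. The paper does not name the test used to compare variances, so the classical F-ratio of sample variances is assumed here purely for illustration; the responses and the function name are hypothetical.

```python
from statistics import variance


def variance_ratio(unstructured, structured):
    """Ratio of sample variances (unstructured / structured) with its degrees of freedom."""
    ratio = variance(unstructured) / variance(structured)
    return ratio, (len(unstructured) - 1, len(structured) - 1)


# Hypothetical experience-level responses for a single audit task.
structured_responses = [4, 4, 5, 4, 5, 4]
unstructured_responses = [3, 6, 4, 7, 5, 5]

ratio, dof = variance_ratio(unstructured_responses, structured_responses)
print(f"variance ratio = {ratio:.2f}, degrees of freedom = {dof}")
```

A ratio above 1 indicates that unstructured firm managers' responses are more variable for that task; judging significance would require comparing the ratio with an F distribution on the stated degrees of freedom, or using a test that is less sensitive to non-normality.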

This procedure was repeated using supervise/review experience responses. Results are 
similar to those for the performance of audit tasks. Unstructured firm managers’ responses are 
more variable than those of structured firm managers in 13 of the 19 tasks. Variances are 
significantly different between groups for nine of the 19 tasks (OKB, PPA, MPR, MPM, ACC, 
DST, PST, ERS, DAR). Supporting H3b, in seven of the nine tasks for which the differences are 
significant, unstructured firm managers’ responses exhibit higher variance. 


Task-level analysis 


To rule out the possibility that an extraneous firm-level variable is driving these results, 
correlations between average task-level structure rating differences (across structure category) 
and differences in the size of the standard deviations of the responses were obtained. For 
performance responses, the Spearman correlation between structure and standard deviation 
differences (.43, n = 19) is significant at p < .07. For supervision/review responses, the Spearman 
correlation between structure and standard deviation differences (.52, n = 19) is significant at 
p < .03. Thus, consistent with H3a and b, results indicate that large (small) differences in task- 
level structure ratings are associated with correspondingly large (small) differences in the 
variance of responses across structure category, in the expected direction. In sum, these results 
are consistent with the prediction that structured approaches yield human resource assignments 
that are less variable across auditors. 


VI. SUMMARY, CONCLUSIONS, AND FUTURE RESEARCH DIRECTIONS


This study affirms the existence of systematic and pervasive differences in the use of 
structured approaches among large auditing firms. However, the study also demonstrates the 
existence of structure variation across tasks within firms, and highlights the potential role of task- 
level structure measures in eliminating possible firm-level alternative explanations (also see Dilla 
and Stone 1994). Overall, this study's findings indicate that audit structure continues to be a 
relevant and potentially informative construct from a theoretical perspective, and a critical 
decision variable from a practical perspective. 

In this experiment, the structuredness of a particular audit approach at the task level impacted 
the experience level required of both the personnel assigned to perform and to initially supervise/ 
review the associated judgment-oriented audit task in both lower and higher complexity 
environments. To the extent that these approaches are similar in effectiveness, this result suggests 
that human resource efficiency may be an important explanation for increased reliance on 
structured approaches over the past decade. However, conclusions regarding the relative "full
equilibrium" efficiency of structured approaches depend on the effectiveness of such approaches
and on the full set of costs and benefits involved in such areas as development and implementation,
marketing, training, litigation, etc. Thus, these issues are important topics for future research.

The direction of the structure-by-environment interaction for the performance of judgment 
tasks proved to be opposite of that predicted with respect to the experience levels of regular audit 
team members. While the environmental complexity manipulation arguably increases the 
likelihood of a decreased structure/environment fit for the set of tasks included in this study, the 
degree of the mismatch caused by the manipulation is difficult to assess. Thus, the possibility that 
the manipulation was not strong enough to cause a perceived structure/environment mismatch on 
the part of structured firm managers must be acknowledged. However, it should be noted that 
unstructured firm managers did significantly increase experience levels in response to the 
manipulation. In addition, the available evidence suggests that structured firm managers responded
by enhancing the expertise of the audit team through increased reliance on specialists.
The structure-by-environment interaction for the initial supervision/review of judgment tasks 
was not significant, though both structured and unstructured firm experience levels increased in 
response to the higher complexity environment. 

These results are potentially informative with respect to differences in how structured and 
unstructured firm managers respond to environmental complexity, and suggest questions for 
future research. Do managers make appropriate structure/experience trade-offs in terms of task 
effectiveness? What is the impact of environmental complexity on the appropriateness of these 
trade-offs, and what role do specialists play? What are the determinants of human resource 
allocations in terms of the time and experience required to complete an audit (e.g., see 
Hackenbrack and Knechel 1995; Prawitt and Spilker 1995)? In view of the facts that human
resources represent the most costly input in the auditing process and that the authoritative 
literature recognizes audit staffing as a critical audit judgment, the scarcity of research on these 
issues suggests potentially significant research opportunities. 

Individual tests of differences in variance across structure category provide evidence 
consistent with the notion that structured audit approaches lead to greater consistency in the
assignment of human resources. Such consistency is likely to have implications for the nature and
timing of the experience gained by auditors and for other important organizational characteristics, 
including the nature and timing of training, within-firm mobility, and advancement. For example, 
a finding from the exit survey indicates that structured-firm staff auditors are promoted at more 
standard intervals than are unstructured-firm staff auditors (F = 1.34, p = .05).

Finally, additional research might be productive in assessing the effects of audit structure in 
such areas as auditor socialization, training, and expertise. For example, an important question 
is whether structured decision aids facilitate the transfer and understanding of high-level 
knowledge to users and as such serve as learning tools, or whether such aids encourage the 
mechanistic application of poorly understood routines, thereby impeding the formation of 
sophisticated mental representations (e.g., Glover et al. 1995; Hyatt 1995). 

Despite the measures taken to make the experimental task as realistic as possible and to 
mitigate possible noise and bias problems, conclusions based on the results of this study are 
subject to limitations. First, the structure ratings are subject to some of the caveats noted by CL 
(1986). For example, the rating process was subjective in nature, was limited to materials 
submitted by the firms, and included only verbal descriptions of electronic tools used by the firms. 
Second, while ex-post checks do not indicate the presence of bias, the nature and scope of the 
experiment did not allow for complete researcher control at all phases of data collection. Third, 
generalizations to practice are limited by the assumption that responses realistically reflect human 
resource allocation decisions in practice. However, to the extent that no biases are induced across 
structure category, these considerations do not hinder interpretation of the hypothesis tests. 
The Objectivity of Accountants’
Litigation Support Judgments 
ABSTRACT: This study examines accountants’ objectivity when serving as a
litigation specialist and expert witness on legal cases. Its purpose is to determine
whether professional objectivity, in generating a fair and unbiased accounting
estimate, will be influenced by the possible conflicts of interest inherent in the
litigation support role. Employing litigation specialists and auditors from two firms,
and the Defining Issues Test as a psychometric for practitioners’ ethical reasoning,
this research examines client advocacy on a damage valuation experiment manipulating
the client's legal position. Results show that domain-specific experience
coupled with ethical reasoning reduces the extent of bias or "side taking" in litigation
support judgment.


Key Words: Objectivity, Ethical reasoning, Litigation support, Expert witness. 


Data Availability: Data and experimental materials will be made available by the 
author upon written request. 


I. INTRODUCTION 


Litigation support services provide real value to the client-attorney relationship because
CPAs help win lawsuits and earn settlements. It is not clear, however, how litigation 
specialists in public accounting firms balance their requirement for objectivity against the 
demands of client advocacy in extant practice. The present paper sheds light on this issue by 
examining the litigation specialist's orientation toward objective practice when generating a 
damage valuation report for a client in a lawsuit. This study tests the influence of client advocacy 
on litigation support judgment by experimentally examining the effects of the client's legal 
position, as either defendant or plaintiff, in a civil lawsuit pertaining to a disputed insurance
claim. Because objectivity is an ethical construct, the theory of ethical reasoning is used to 
understand more clearly how the litigation specialist's ability to provide objective testimony is 
related to his or her capacity to frame, process and resolve ethical conflict within the professional 
domain. 

A between-subjects experimental design was utilized, employing a final sample of 101 
litigation support specialists and 106 auditors with no prior background or experience in the 
litigation services area, ranging from staff to partner levels in two public accounting firms. The 
experiment required individuals to calculate and report the damage valuation of an organization's 
physical inventory that was destroyed in a fire. In addition, subjects completed the full six-story
version of the Defining Issues Test (DIT), a well-known and reliable psychometric of ethical
reasoning (Rest 1979b). 

One primary and two subsidiary hypotheses are advanced concerning the objectivity of 
expert testimony produced by litigation support specialists in public accounting firms. The 
primary hypothesis suggests that litigation specialists in accounting will seek to accommodate the 
client by rendering a conservative (low) accounting estimate if the client wishes to avoid damages 
and an unconservative (high) accounting estimate if the client seeks to prove damages. In this 
regard, no bias exists if the accountant's estimate in each instance is exactly the same. The first 
subsidiary hypothesis predicts that accountants with higher ethical reasoning skills are more 
likely to uphold objectivity based on their intrinsic principles, irrespective of client advocacy 
pressure to do otherwise. The second subsidiary hypothesis predicts that accountants at higher 
positions should favor client advocacy over objectivity, given that their role is more closely linked 
to their ability to win cases or earn settlements for their clients in a lawsuit. 
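
The primary hypothesis implies a simple operational measure of advocacy bias in a between-subjects design: the gap between the mean damage estimate of subjects assigned to the plaintiff and that of subjects assigned to the defendant. The sketch below is illustrative only; the estimates are hypothetical, and the function name is not taken from the study.

```python
from statistics import mean


def advocacy_bias(plaintiff_estimates, defendant_estimates):
    """Mean plaintiff-side estimate minus mean defendant-side estimate."""
    return mean(plaintiff_estimates) - mean(defendant_estimates)


# Hypothetical damage valuations (in dollars) from two experimental groups.
plaintiff_group = [520_000, 540_000, 500_000]
defendant_group = [470_000, 480_000, 490_000]

bias = advocacy_bias(plaintiff_group, defendant_group)
print(f"estimated advocacy bias: {bias}")
```

A value of zero corresponds to the no-bias benchmark described above; a positive value is the direction the primary hypothesis predicts.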

Research findings corroborate the primary hypothesis by showing that the estimated value 
of damages was higher for individuals who were told that they represented the plaintiff (e.g., a 
business firm suing its insurance company for the fair proceeds of a casualty claim) rather than 
the defendant (e.g., the insurance company) in a lawsuit. Domain-specific experience and DIT 
scores were found to be related to the extent of bias in damage valuations, where between-subject 
differences in accounting estimates provided by litigation specialists tended to converge with the 
results of the control group at higher experience levels. This pattern of findings, however, was not 
found for the sample of auditors. 

The remainder of this paper is organized into four general sections. The next section provides
a brief overview of litigation services in public accounting firms and a more general discussion
of the actual and perceived objectivity of expert testimony on legal cases. The theory of ethical
reasoning is used to explain the objectivity paradigm in the present paper, as well as the primary
and subsidiary hypotheses tested in this research. Section III provides an overview of experimental
methods employed in this research and section IV provides a summary of results. The final
section concludes the paper with a discussion of the implications and the limitations of the 
research and a proposal for future work in this area. 


II. BACKGROUND AND HYPOTHESIS DEVELOPMENT


Objectivity and Expert Testimony 


According to professional standards, objectivity is an essential part of both the accounting 
and auditing functions. In this context, objectivity is defined as a mental attitude that permits the 
individual accountant or auditor to fulfill professional responsibilities without compromising 
judgment or ethical beliefs or yielding to the demands of others within and outside the 
organization (Mautz and Sharaf 1961).¹ The need for objectivity over client advocacy in the
litigation support arena is succinctly stated in Technical Practice Aid No. 7 published by the 
Management Advisory Services division of the AICPA (Wagner and Frank 1986, 6): 


As an expert witness, the CPA presents opinions publicly in an objective fashion, but as 
a consultant the CPA advises and assists the attorney or client in private [emphasis 
added]. 


Epstein and Spalding (1993, 196) note that the Principles of the AICPA’s (1991) Code of 
Professional Conduct apply to accountants who serve as litigation support specialists and expert 
witnesses and, by virtue of Rule 102 on integrity and objectivity, are required to “. . . place the 
obligation to the public ahead of any obligation to a particular plaintiff or defendant as a result 
of the engagement." 

While litigation support is not considered an attestation engagement, CPAs who provide
litigation support services for clients and attorneys are required to adhere to the AICPA’s (1991)
Statement on Standards for Consulting Services Number 1. In addition, the AICPA offers specific
guidance on professional and ethical issues when providing litigation support services in two
recent special reports, Application of AICPA Professional Standards in the Performance of
Litigation Services (AICPA 1993a) and Conflicts of Interest in Litigation Services Engagements
(AICPA 1993b). These standards and guidelines clearly indicate the critical importance of 
integrity and objectivity, professional competence and due professional care when performing 
litigation support engagements. 

The applied psychology literature suggests that maintaining a truly objective point of view 
at all times may be impossible for members of a profession who face social and economic 
pressures (Appelbaum 1987; Greenberg and Wursten 1988; Howell 1990; Otto 1989; Wedding 
1991). Although the testimony produced by expert witnesses has been repeatedly shown in 
empirical studies to have a powerful influence on juror perceptions and behavior (see Reinard 
1988, for a summary of major works in this area), studies in forensic psychology (Wedding 1991;
Williams 1992), psychiatry (Greenberg and Wursten 1988; Zonana 1984) and medicine (Marcus 
1985; Nussbaum 1985) have found that individuals serving as expert witnesses in civil and 
criminal cases may provide relatively biased testimony that results from a willingness to advocate 
the best possible legal position for clients and their attorneys. 

For example, Otto (1989) examined the objectivity of expert testimony for a sample of 32 
advanced doctoral students in clinical psychology by using hypothetical court cases (e.g., 
criminal and civil) and two experimental manipulations concerning the legal position of the 
expert as either being retained by the plaintiff or defendant. According to the author, findings of 
this work show that (1989, 267) ". . . [subjects] were generally more sympathetic to the party that 
employed them" [emphasis added], especially for criminal court cases. In a related study of mental 
health workers who served as expert witnesses on child welfare and custody cases, Howell (1990, 
15) found that “Objectivity . . . can best be achieved when the psychologist is appointed by the 
judge as a friend of the court." 

Another issue concerning the study of expert witnesses has focused on differences in the 
perceived versus the actual reliability of expert witness testimony (Boyll 1991; Jackson 1988; 
Kaplan and Lynn 1978; Kassin et al. 1989; Kassin and Wrightsman 1988; Pennington and Hastie 
1990; Wagenaar 1988). For example, Jackson (1988) found that both lawyers and psychiatrists 


1 Morgan (1988) advances an alternative viewpoint, arguing that objectivity in extant accounting and auditing practice
is essentially a myth that hinders the advancement of the profession. 
were more confident than laypersons about the treatability of the alleged criminals’ mental
disorders, and that such confidence in judgment (1988, 215) “. . . may bias courts toward their
[experts'] opinions.” Although social science testimony is not inherently biased, Wagenaar 
(1988) argues that, by virtue of framing effects, members of the court (e.g., jurors, judges and 
attorneys) often may misinterpret the information on which expert opinions are based. The same 
may be true in the auditing setting, where jurors and the judiciary may have unrealistic perceptions
of practice (Jennings et al. 1991; Anderson et al. 1993; Lowe and Reckers 1994), as well as the 
role of professional standards (Buckless and Peace 1993). 

The above studies raise some doubt about whether psychological or medical experts can 
maintain their objectivity when the client who retains them is affected by the outcome of the 
litigation. This problem also pertains to accountants who are often required to serve clients and 
their attorneys in a consulting capacity and then are called to stand as an expert witness in the 
courtroom (Collier 1989; Zipp 1992; Crain et al. 1994). While most accounting firms segregate 
litigation support from attest engagements within the practice office (Bridger 1992; Wallace 
1992), the general public may be unable to distinguish between these two very different roles for 
public accounting professionals. Hence, a belief that accountants do not exercise objectivity on 
litigation support engagements can lead to a more general perception of diminishing ethical 
propriety in other areas of practice (Epstein and Spalding 1993; Previts 1986). Beyond the 
appearance issues, professional standards (Wagner and Frank 1986; AICPA 1991) explicitly 
require accountants and auditors to exercise a high degree of objectivity when performing non- 
audit services for client organizations—especially when serving as an expert witness on 
accounting issues in the courtroom (see AICPA 1993a, 1993b).²


Theory of Ethical Reasoning 


The psychology of ethical reasoning provides theories that explain the decision making 
process by which individuals recognize, reason and resolve ethical conflicts such as those created 
in the litigation support field. This area of psychology is based on Kohlberg’s (1969) stage- 
sequence model that defined a series of cognitive levels and stages somewhat akin to the rungs 
of a ladder. That is, all individuals move upwardly through these developmental levels beginning 
at what is termed “pre-conventional morality,” to the second level termed “conventional 
morality,” and sometimes to the final and highest level called “post-conventional morality." 

To paraphrase Kohlberg (1984, 624—639), the three levels can be understood as three 
different types of relationships between the self and society's rules and expectations. To a pre- 
conventional person, rules and social expectations are something external to the self; a conven- 
tional person identifies self in relation to others; a post-conventional person differentiates the self 
from the rules and expectations of others and defines his or her values in terms of self-chosen 
principles. To the pre-conventional person, resolution of an ethical dilemma is simply based upon 
the immediate cost and/or benefit of ethical action. To the conventional person, resolution is based
upon the avoidance of harm to others belonging to one's social institution. The post-conventional 
person frames an ethical judgment based upon an internalized, self-chosen set of principles. 

Since its inception, much research has been conducted to validate the stage-sequence model 
of ethical development (see Ponemon and Gabhart 1994 for a summary of empirical studies in 


2 According to an ethics interpretation of the AICPA (1989, 170), providing service as an expert witness at the request
of a client constitutes an “engagement” in the practice of public accounting. Thus, litigation support specialists in public 
accounting firms are obligated to uphold all relevant professional promulgations on public accounting practice. 
accounting and auditing). The results of this research suggest that members of the public
accounting profession are not reaching their potential for higher levels of ethical reasoning by 
virtue of selection-socialization in education and within public accounting firms. In addition, 
findings show that ethical reasoning is an important determinant of professional auditing 
judgments such as the disclosure of sensitive information, independence and fraud detection. 
Results also indicate that unethical or dysfunctional behavior may be systematically related to the 
ethical reasoning level of the professional accountant. 


Primary and Subsidiary Hypotheses 


Following from the above mentioned studies by Otto (1989), Howell (1990) and Wedding 
(1991) on the objectivity of expert testimony, the primary hypothesis tests the influence of client 
advocacy on the professional accountant's litigation support judgment.⁴


H1: Accountants’ estimates of damages will be influenced by the client's legal position,
where accounting estimates will be larger for clients seeking to prove damages and 
smaller for clients seeking to evade damages. 


Drawing on the findings of Ponemon (1990, 1992a, 1992b), Ponemon and Gabhart (1990), 
Arnold and Ponemon (1991), Lampe and Finn (1992) and Shaub (1993), accountants and auditors 
with relatively high reasoning levels (as measured by the Defining Issues Test) are more likely 
to uphold ethical standards such as objectivity based on the individual's intrinsic principles, 
irrespective of business-related pressures to do otherwise. On the other hand, those with 
relatively low levels of ethical reasoning are mostly concerned about maximizing self-interest or 
avoiding punishment and are more likely to shirk on ethical standards such as objectivity 
(especially when the costs of noncompliance are perceived to be low or nonexistent). 


H2: Accountants’ ethical reasoning (as measured by the DIT P score) will be positively
related to the objectivity of accounting estimates. 


Even though research findings concerning the influence of experience on the objectivity of 
expert testimony are generally mixed (Jackson 1988; Marcus 1985; Nussbaum 1985; Wagenaar 
1988; Wedding 1991; Zonana 1984), the present study advances the idea that over time 
individuals in the litigation support field develop the skills needed to win cases for their clients. 
In this regard, the most valuable expert witnesses in public accounting are those who have 
developed the ability to render defensible positions that maximize the interests of particular 


3 Kohlberg (1984) and Rest (1986) provide a summary of the results of hundreds of studies in this area and also address
the most salient critiques to the theory including the stability of moral judgment stages, the existence of possible gender 
bias and the possibility of measurement error in psychometric methods. 

4 Support for this hypothesis does not mean that objectivity is purely an ethical construct for litigation specialists in public
accounting firms. Rather, many other factors, such as incentives and cognitive framing effects, play an important part 
in determining the extent of bias provided by accounting experts when serving their clients in an advocacy position. For 
instance, Johnson (1993) found that tax accountants favored advocacy over objectivity when interpreting evidence of 
income and expenses to justify a client's tax return position. More recently, Cuccia et al. (1995) found that experienced 
tax accountants tended to take an aggressive tax position for their clients when making inferences about the application 
of verbal standards or when assessing the client's supporting documentation to the income tax return. 

5 The notion that ethical behaviors such as objectivity and independence are “role prescribed" is developed by Gaa (1993). 
This argument is consistent with the theory of moral expertise also advanced by Gaa (1993) and recently tested by Gaa 
and Ponemon (1994). This theory goes well beyond the ethical reasoning framework by showing that "superior" ethical
judgment is influenced by a configuration of ethical reasoning skills and domain-specific experience within the
professional domain. 
clients or their attorneys (see Collier 1989; Wagner 1990). However, litigation is a costly process,
and too much bias in judgment may lessen the credibility and value of the expert’s testimony. 
Thus, in some instances it is possible that highly expert litigation specialists will avoid extreme 
positions when testifying on an accounting issue to signal their competence to the judge and jury. 
Notwithstanding this possibility, however, the second subsidiary hypothesis suggests that more 
experienced litigation support specialists take a stronger client advocacy position and, therefore, 
will be more likely to produce biased, but defendable, expert testimony than less experienced 
accountants. 


H3: Accountants’ experience in providing litigation support will be inversely related to the
objectivity of accounting estimates. 


III. METHODS 


Subjects 


The partners in charge of the offices of several international public accounting firms located 
in the northeastern region of the United States were contacted by the researcher. The culmination 
of this effort was the agreement by two firms to participate in this study. With the assistance of 
firm management, professional accountants in the two firms were asked (on a voluntary basis) to 
complete an experiment dealing with the calculation and estimation of an organization's physical 
inventory that was destroyed in a fire. Subjects in both firms completed all experimental 
materials, either individually or in small groups (of fewer than four people), under the direct 
supervision of the researcher.

By design, the sample included individuals in the litigation support area as well as auditors 
who did not have prior experience in litigation support. Litigation specialist levels ranged from 
staff to partner, and all litigation support subjects had experience in providing litigation support 
services for the firm’s clients. Auditors also ranged in experience from staff to partner ranks. The 
experience levels of participating subjects in this study were deemed to be appropriate because 
the experiment (described below) required individuals to perform damage calculations, a task
typically handled by staff or seniors and reviewed by managers and partners. From this evidence,
managers and partners prepare the expert testimony to be furnished in depositions and the courtroom.

The reason for selecting auditors who did not possess litigation experience was to examine 
the influence of domain-specific experience, rather than public accounting experience alone, on 
accountants’ litigation support judgment. This distinction is necessary because, as noted by Patel 
and Green (1991), there may be important differences in the decision making abilities of experts 
in a given domain and sub-experts in a closely related domain. In this regard, litigation specialists 
served as domain experts, and auditors as sub-experts, in the present experiment. With the 
assistance of administrative personnel in the two firms, auditors were approximately matched to
the sample of litigation specialists based on years of experience, position level and gender. 
Participants in both practice areas were identified and selected with the direct assistance of 
administrative personnel. 

In total, 225 subjects participated in this study with 207 individuals providing usable 
responses. Of these professional accountants, 101 were in litigation support and 106 were in 
auditing (110 from firm one and 97 from firm two). Table 1 reports the distribution of litigation 
specialists and auditors within the two participating public accounting firms by area of practice 
and position level, and corresponding mean values or proportions for demographic variables 
including age, gender, domain-specific experience, education level and professional certification. 


TABLE 1


Samples of Litigation Specialists and Auditors by CPA Firm, Gender, Position Level, 
Age, Experience, Educational Level and Professional Certification 








                   Litigation Specialists              Auditors              The Combined Samples
                   Firm 1   Firm 2  Overall   Firm 1   Firm 2  Overall   Firm 1   Firm 2  Overall
Men                    32       28       60       30       30       60       63       57      120
Women                  23       18       41       25       21       46       47       40       87
Totals                 55       46      101       55       51      106      110       97      207

Staff                   3        4        7        4        5        9        7        9       16
Seniors                26       20       46       24       21       45       50       41       91
Managers               21       17       38       23       20       43       44       37       81
Partners                5        5       10        4        5        9        9       10       19
Totals                 55       46      101       55       51      106      110       97      207

Age (a)             28.00    29.01    28.48    27.01    27.88    27.43    27.52    28.42    27.94
Experience (a)       6.55     7.03     6.77     5.73     6.30     6.01     6.14     6.65     6.38
Education (a)       16.69    16.99    16.83    16.64    16.51    16.58    16.67    16.73    16.69
Certification (b)     .95      .91      .93      .93      .92      .92      .94      .91      .93


(a) The number represents an average value in terms of years.
(b) Certification is a proportion. This category represents the percentage of individuals who hold one or more professional
certifications (such as the CPA, CMA, CIA and CA).


Because experience may not be a completely reliable surrogate for expertise in certain 
situations (Davis and Solomon 1989), litigation support specialists in both firms completed a 
debriefing questionnaire that attempted to capture their domain-specific experience in the
litigation support field. Item analysis was performed using a nonparametric test of between-
sample equivalence (Hollander and Wolfe 1973, 71), which revealed statistically significant
differences at the p < .05 level for the combined survey medians by position level of the litigation 
specialist—confirming that managers and partners had more experience in assessing damages 
and in providing expert testimony than senior or staff level personnel. Hence, position level was 
used to stratify litigation specialists into high (48 partners/managers) and low (53 senior/staff) 
experience subgroups. Using the same debriefing questionnaire for auditors revealed no discern- 
ible difference for any survey item by position level. For purposes of consistency, however, 
auditors were stratified by position into high (52 partners/managers) and low (54 senior/staff) 
experience subgroups. 

Gender, age and education were analyzed to determine whether they were associated with 
subjects’ responses to the experiment. Using the Pearson chi-squared test (Feinberg 1983, 40),
these variables did not produce significant differences in the assignment of individuals to 
treatments. Individuals in both participating firms were also compared in terms of responses to 
the experimental task, showing that between-firm differences were not significant. For purposes 
of analysis, the samples of litigation specialists and auditors from the two public accounting firms 
were combined. 
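
The Pearson chi-squared statistic behind these checks can be sketched as follows. This is a minimal illustration, not the study's computation: the treatment-assignment cell counts are not reported in this section, so the example instead uses the gender-by-practice-area counts from the "Overall" columns of table 1. The function name and the hard-coded critical value (3.841, the .05 cutoff for one degree of freedom) are choices made for this sketch.

```python
def pearson_chi_squared(table):
    """Return (statistic, degrees_of_freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Rows: men, women. Columns: litigation specialists, auditors (table 1, "Overall").
gender_by_practice = [[60, 60], [41, 46]]
chi2, dof = pearson_chi_squared(gender_by_practice)

CRITICAL_05_DF1 = 3.841  # chi-squared critical value at the .05 level, 1 d.f.
print(f"chi2 = {chi2:.3f}, dof = {dof}, significant at .05: {chi2 > CRITICAL_05_DF1}")
```

On these counts the statistic falls well below the critical value, which is the pattern one would expect if auditors were matched to litigation specialists on gender.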
Defining Issues Test


The complete six story version of the Defining Issues Test was used to measure each subject’s
level of ethical reasoning (Rest 1979b). It is a widely used and reliable psychometric instrument 
and provides a surrogate measure of an individual’s level of ethical reasoning according to the 
cognitive developmental theories posited by Kohlberg (1969) and Rest (1979a). The DIT is a self- 
administered questionnaire that includes a series of six hypothetical conflicts. For each conflict 
or dilemma, subjects are required to select and rank order those issues that have, in their opinion, 
the most significant influence on its resolution. 

The DIT provides several responses representing the most typical modes of thinking in terms 
of Kohlberg’s (1969) and Rest’s (1986) theories. In scoring the questionnaire, points are assigned 
to each subject’s responses using a scale of four points for the most important to one point for the 
least important. The points corresponding to the highest modes of reasoning are used to construct 
a single measure known as the “P” (principled) score, which measures the percentage of post- 
conventional responses made by an individual subject to the entire instrument. Therefore, results 
are expressed as a continuum from 0.0 to .95. Since the stage-sequence model is developmental 
and sequential, a higher P score also indicates a lower percentage of pre-conventional and 
conventional responses.⁶
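
The scoring rule described above can be sketched in a few lines. This is an illustration under stated assumptions only: the item numbers, the two-story layout and the key identifying which items are "principled" are hypothetical, and the actual DIT scoring key and its consistency checks are not reproduced here.

```python
RANK_POINTS = (4, 3, 2, 1)  # most important item ... fourth most important item


def p_score(ranked_items_by_story, principled_items_by_story):
    """Percentage of ranking points assigned to principled (post-conventional) items."""
    principled_points = 0
    total_points = 0
    for story, ranked_items in ranked_items_by_story.items():
        principled_items = principled_items_by_story[story]
        for points, item in zip(RANK_POINTS, ranked_items):
            total_points += points
            if item in principled_items:
                principled_points += points
    return 100.0 * principled_points / total_points


# Hypothetical key: item numbers treated as post-conventional in each story.
key = {"story_a": {5, 8, 11}, "story_b": {2, 9}}
# Hypothetical subject: the four items ranked most important in each story.
responses = {"story_a": [8, 3, 5, 1], "story_b": [4, 9, 7, 6]}

score = p_score(responses, key)
print(f"P score = {score:.1f}")
```

In this toy case the subject gives 4 + 2 points to principled items in the first story and 3 points in the second, out of 20 available points, so the P score is 45.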

The DIT was completed by all participating subjects after the completion of the experiment 
and debriefing questionnaire. The Pearson chi-square test (Feinberg 1983, 40) revealed no 
significant differences in the distribution of subjects by firm or treatment according to their DIT 
P score. Table 2 reports the DIT P score means, medians, standard deviations and ranges for all 
subjects by area of practice, firm and experience subgroups.⁷


Experimental Methods 


Subjects completed an experimental task based on the computation of a physical inventory 
that was destroyed in a warehouse fire (on April 3, 1991) by using the gross profit method. The 
experimental materials included the description of the inventory, background on the business firm
incurring damages and the nature of litigation. With the assistance of the director of litigation 
support services from one of the participating firms, 27 actual lawsuits involving damage 
calculations and expert testimony by accounting firm personnel were carefully reviewed by the 
researcher, a graduate assistant and two senior auditors in the litigation support department on a 
case-by-case basis.⁸ The scenario finally chosen by the group was based on a litigation support
engagement that was recently completed by the firm. 

This particular lawsuit was selected for three reasons. First, the computation of damages in 
the case was relatively easy to complete and did not depend on the litigation experience of the 
individual performing the task. Second, the actual lawsuit, based on a disputed insurance value 
of lost assets, was relatively straightforward and did not involve the possibility of fraud. Third, 


6 Because the DIT is scored objectively, statistical reliability and validity can be fairly assessed. Test-retest reliability for 
the “P” score is generally in the high .70s or .80s. Cronbach’s alpha index of consistency is generally in the high .70's 
(see Rest 1986, 11). According to Rest (1979a), the DIT has been validated in a number of ways including predictive 
validity, face validity, criterion group validity, longitudinal change studies, convergent-divergent validity,
experimental enhancement studies and discriminant validity.

7 As can be seen in table 2, the mean and median DIT P score for both firms and both areas of practice are approximately
the same. In addition, managers and partners have lower overall DIT P scores than staff and seniors. These results are 
completely consistent with the finding of an inverse relationship between DIT P scores and experience found in earlier 
accounting ethics studies as summarized by Gaa (1993) and Ponemon and Gabhart (1994). 

8 The 14 items included as the debriefing task were based on open-ended interviews with one partner and four managers
in the litigation support department of one participating CPA firm. 
TABLE 2
DIT P Score Means, Medians, Standard Deviations and Ranges for Litigation Specialists
and Auditors by CPA Firm and Experience Subgroups


                         Litigation Support          Auditing
                         Firm 1      Firm 2      Firm 1      Firm 2

Low Experience (n)           29          24          28          26
  Mean                    40.89       40.08       40.61       40.52
  Median                  39.30       40.00       38.60       40.10
  St. Dev.                13.79       13.61       14.13       14.08
  Range                   54.30       50.10       49.00       57.80

High Experience (n)          26          22          27          25
  Mean                    37.71       38.02       38.21       37.55
  Median                  36.90       37.00       37.30       36.90
  St. Dev.                10.94       11.24       12.10       11.90
  Range                   43.10       48.00       51.50       46.50

Overall Sample (n)           55          46          55          51
  Mean                    39.39       39.09       39.43       39.06
  Median                  38.60       38.10       38.00       37.50
  St. Dev.                14.06       13.43       14.12       13.99
  Range                   54.30       50.10       51.50       58.00


Note: 


The above results are based on the full six story version of the Defining Issues Test (DIT) developed by James Rest (1979b).
DIT results were computer scanned. All missing or incomplete items were manually reconciled or deleted from the sample. 


the historic gross profit percentages used in the computation of inventory varied during the 
relevant period of analysis and, therefore, gave individuals considerable latitude in estimating 
damages. 

To maintain strict confidentiality, references to the actual lawsuit or litigation support 
engagement were deleted from experimental and debriefing materials. The following three 
paragraphs, read by all subjects, provide the background information to the experimental task. 


On April 3, 1991 a fire destroyed the entire merchandise inventory on hand of 
Johnston Wholesalers and Distributors, Inc. (JWD), a distributor of truck accessories and
farm equipment. The company is a medium sized business firm located in Albany, New 
York with one primary warehouse location, and sells to truck and tractor dealerships 
throughout North America. 

JWD's physical inventory, valued on an average cost basis, is fully insured by a
policy based on the asset valuation or cost at the time of the fire. Although JWD did not 
track quantities on-hand or keep perpetual records for its inventory assets, the company’s
controller was able to provide financial information for computing damages
using the gross profit method to value the inventory loss. 

Twelve months of gross profit percentages as well as yearly averages were also 
provided by the JWD controller to the insurance company. Changes to JWD's gross 
profit were based on normal seasonal variation, especially during the Fall season, and the 
presence of significant competition from a large Korean firm who entered the U.S. and 
Canadian markets during the first quarter of 1991. 


In addition, subjects were given quantitative data shown in figure 1, for purposes of computing 
accounting damages. To avoid contrast effects between experimental manipulations (see, for 
instance, Joyce and Biddle 1981; Pany and Reckers 1987), the present study on litigation support 
judgment employed a between-subjects research design.

Each experimental instrument contained (1) the background description of the problem and 
quantitative data (as shown above), (2) one of two experimental treatments (or a control group) 
concerning the legal position of the client who retains the expert, (3) a brief schema that illustrated 
how to compute a missing inventory value using the gross profit method, and (4) a debriefing 
questionnaire to elicit each individual’s perception of the experimental task and, more generally, 
his or her actual experience in providing litigation support services and expert testimony for client 
organizations. 


Treatments and Dependent Variable 


Individuals were randomly assigned to one of two between-subject treatment groups or a 
control group. The first treatment group was told that they represented a business firm (the 
plaintiff) who sued its insurance company for the “fair value” of insured inventory assets. 
Members of the second treatment group were told that they represented the insurance company 
(the defendant) as an accounting expert to determine the “fair value” of JWD's lost inventory. The
experimental treatments received by subjects were stated as follows.⁹


Plaintiff: The insurance company and JWD management disagree about the settlement
amount determined by the insurer’s claims adjustment department. As a result, JWD 
recently sued the insurance company for breach of contract. You have been hired by JWD 
as an accounting expert to provide an estimated valuation of JWD's merchandise 
inventory on April 3, 1991 and to attest to this value in the opposing attorney's 
deposition. 


Defendant: The insurance company and JWD management disagree about the settlement 
amount determined by the insurer's claims adjustment department. As a result, JWD
recently sued the insurance company for breach of contract. You have been hired by the 
insurance company as an accounting expert to provide an estimated valuation of JWD's 
merchandise inventory on April 3, 1991 and to attest to this value in the opposing 
attorney's deposition. 


The remaining subjects were randomly assigned to a control group where experimental materials 
stated that the accounting expert is appointed by the court. For individuals in this group, the legal 


9 Based on discussions with litigation support personnel in firm one, it was decided that separating the dependent variable
into two components (namely, computation and attestation) would be entirely unrealistic. As stated by one litigation
support partner, “All damage computations must be able to stand the test of the courtroom.”
FIGURE 1
Quantitative data used in accounting computations


Gross Sales, 1/1/91 to 4/3/91                                   $55,250,341
Physical Inventory on 1/1/91                                    $10,001,050
Freight-In, 1/1/91 to 4/3/91                                       $974,010
Merchandise purchases, 1/1/91 to 4/3/91 (including 2,432,010
  of goods in transit on 4/3/91, shipped f.o.b. shipping
  point from the vendor)                                        $37,792,093
Purchase returns                                                 $1,889,032

            Gross*                    Gross*
Month       Profit %      Month       Profit %

04/90       33%           10/90       35%
05/90       32%           11/90       39%
06/90       31%           12/90       38%
07/90       31%           01/91       29%
08/90       31%           02/91       27%
09/90       32%           03/91       26%


Average gross profit for first quarter, 1990 = 35.2%
Average gross profit for calendar year, 1990 = 34.0%
Average gross profit for first quarter, 1991 = 27.3%
Average gross profit for the above 12 months = 32.0%
Industry gross profit for calendar year 1990 = 25.0%


* The gross profit percentage equals the gross margin divided by total sales in a given monthly time period.


position of the expert witness is neither stated nor implied, and responses should not be influenced 
by client advocacy considerations. Specifically, all members of the control group received the 
following information. 


Control: The insurance company and JWD management disagree about the settlement 
amount determined by the insurer's claims adjustment department. As a result, JWD 
recently sued the insurance company for breach of contract. You have been appointed 
by the court as an accounting expert to provide an estimated valuation of JWD’s 
merchandise inventory on April 3, 1991 and to attest to this value in the attorney's 
deposition. 
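
Random assignment to the three conditions can be sketched as below. The source does not describe the randomization procedure, so the shuffle-then-deal approach, the function name and the seed are assumptions made purely for illustration. A balanced deal like this one yields group sizes that differ by at most one, whereas the final group sizes reported in table 3 also reflect excluded responses.

```python
import random

CONDITIONS = ("plaintiff", "defendant", "control")


def assign_conditions(subject_ids, seed=None):
    """Shuffle subjects, then deal them round-robin into the three conditions."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {
        subject: CONDITIONS[position % len(CONDITIONS)]
        for position, subject in enumerate(shuffled)
    }


# Hypothetical subject identifiers 1..101 (the number of usable litigation specialists).
assignment = assign_conditions(range(1, 102), seed=1995)
counts = {c: sum(1 for v in assignment.values() if v == c) for c in CONDITIONS}
print(counts)
```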


After reading the treatment, individuals were asked to compile the value of missing
inventory, which served as the primary dependent variable of the study, using the quantitative data
provided by employing the following simple formulation: Missing Inventory = Beginning
Inventory + Purchases − E(COGS), where E(COGS) = [(1 − Gross Profit Percent) × Gross Sales].
After estimating JWD’s inventory value, individuals were required to express their confidence 
in the accounting estimate provided by them using a 100 point continuous scale from “0” denoting 
no confidence to “100” denoting complete confidence. This measure served as a reliability 
check.¹⁰
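
The formulation can be worked through with the figure 1 data. The sketch below is illustrative only: it uses the unadjusted merchandise purchases figure, whereas subjects were free to decide how freight-in, purchase returns and goods in transit enter "Purchases", and that choice is exactly where their judgment came into play. The function and constant names are introduced here for the example.

```python
from decimal import Decimal


def missing_inventory(beginning_inventory, purchases, gross_sales, gross_profit_pct):
    """Missing Inventory = Beginning Inventory + Purchases - (1 - GP%) x Gross Sales."""
    cost_ratio = Decimal(1) - Decimal(gross_profit_pct) / Decimal(100)
    estimated_cogs = cost_ratio * Decimal(gross_sales)
    return Decimal(beginning_inventory) + Decimal(purchases) - estimated_cogs


GROSS_SALES = 55_250_341
BEGINNING_INVENTORY = 10_001_050
PURCHASES = 37_792_093  # assumption: unadjusted purchases, no freight/returns/in-transit changes

# Three of the summary gross profit percentages listed in figure 1.
for label, pct in [
    ("first quarter, 1991", "27.3"),
    ("above 12 months", "32.0"),
    ("calendar year, 1990", "34.0"),
]:
    value = missing_inventory(BEGINNING_INVENTORY, PURCHASES, GROSS_SALES, pct)
    print(f"{label:>20} ({pct}%): ${value:,.2f}")
```

Because estimated cost of goods sold falls as the assumed gross profit percentage rises, a higher percentage always produces a larger inventory estimate, which is why the choice of percentage gives an expert room to favor either party.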

A manipulation check of the primary independent variable was performed two ways. First, 
by employing a pilot sample of 22 second year graduate accounting students, both the plaintiff 
and defendant conditions were found to be salient (see Carmines and Zeller 1979). Second, the 
reliability of responses to the experimental task was determined by matching the estimated value 
of inventory to one of 52 possible inventory values that could be obtained from the quantitative 
data provided (rounded up to three digits)—ranging from $2,949,367 to $14,090,434 (depending 
on the gross profit percentage used and assumptions made about in-transit inventory). In total, 18 
subjects did not meet the matching criteria or did not properly complete the DIT and were 
excluded from the analysis. Confidence in inventory estimates ranged from a low of
20 percent to a high of 100 percent and did not depend on the experimental condition.¹¹
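
The matching screen can be sketched as follows. The source does not list the 52 admissible values, so the candidate grid here is a partial reconstruction under assumptions: the gross profit percentages are those shown in figure 1, and the two purchase treatments are inferred only because they reproduce the two reported bounds arithmetically. The one-dollar tolerance is likewise an assumption.

```python
from decimal import Decimal

GROSS_SALES = Decimal(55_250_341)
BEGINNING_INVENTORY = Decimal(10_001_050)
PURCHASES = Decimal(37_792_093)
FREIGHT_IN = Decimal(974_010)
GOODS_IN_TRANSIT = Decimal(2_432_010)

# Monthly and summary gross profit percentages appearing in figure 1.
GP_PERCENTAGES = ["25.0", "26", "27", "27.3", "29", "31", "32",
                  "33", "34.0", "35", "35.2", "38", "39"]

# Assumption: two purchase treatments inferred from the reported bounds.
PURCHASE_TREATMENTS = {
    "unadjusted": PURCHASES,
    "less in-transit and freight-in": PURCHASES - GOODS_IN_TRANSIT - FREIGHT_IN,
}


def estimate(purchases, gross_profit_pct):
    """Gross profit method estimate for one purchase treatment and percentage."""
    cogs = (Decimal(100) - Decimal(gross_profit_pct)) / Decimal(100) * GROSS_SALES
    return BEGINNING_INVENTORY + purchases - cogs


candidates = sorted(
    estimate(purchases, pct)
    for purchases in PURCHASE_TREATMENTS.values()
    for pct in GP_PERCENTAGES
)


def matches_candidate(response, tolerance=Decimal(1)):
    """True if a response lies within the tolerance of any candidate value."""
    return any(abs(Decimal(response) - c) <= tolerance for c in candidates)


print(f"lowest candidate:  ${int(candidates[0]):,}")
print(f"highest candidate: ${int(candidates[-1]):,}")
```

This grid contains 26 candidates rather than 52, so it should be read as a demonstration of the screening logic and not as the study's actual list.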


IV. RESULTS 


The means, medians and standard deviations of inventory estimates for between-subject 
treatments, experience subgroups and DIT levels are provided in table 3 for litigation specialists 
and in table 4 for auditors.¹² According to the primary hypothesis (H1), litigation specialists are
expected to provide a lower inventory value when they are employed by the defendant (insurance
company) and a higher inventory value when they are hired by the plaintiff (JWD). The inventory
estimates provided by accountants in the control group are used as the benchmark for unbiased 
or objective estimation. 

The statistical significance of between-group differences is reported in table 3 using the
Mann-Whitney U test (Hollander and Wolfe 1973, 71). The Binomial test (Hollander and Wolfe
1973, 15) is also used to determine the power (π) of specific between-group comparisons. As
predicted by H1, litigation specialists in the plaintiff treatment provided significantly higher
inventory estimates than the control group ($7,369,395 > $6,520,641, p < .01 and π > .96). Also
in support of H1, individuals in the defendant treatment provided significantly lower inventory
estimates than the control group ($5,933,881 < $6,520,641, p < .05 and π > .91).¹³
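
The Mann-Whitney U comparison can be sketched in plain Python. The estimates below are hypothetical values in millions of dollars, invented for illustration and not drawn from the study; the normal approximation without a tie correction is also a simplification, and exact tables would normally be preferred for samples this small.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, z score under the normal approximation)."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return u_a, (u_a - mean_u) / sd_u


# Hypothetical inventory estimates (millions of dollars), NOT the study's data.
plaintiff = [7.4, 7.6, 6.9, 8.1, 7.2, 9.0]
control = [6.5, 6.7, 5.9, 7.0, 6.1, 7.3]

u, z = mann_whitney_u(plaintiff, control)
print(f"U = {u:.1f}, z = {z:.2f}")
```

Here U counts the plaintiff-control pairs in which the plaintiff-side estimate is the larger one, so a U close to the maximum (the product of the two sample sizes) indicates that one group's estimates sit systematically above the other's.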

Table 3 shows a pattern of interesting, but somewhat unexpected, results in terms of the
influence of ethical reasoning and experience on accountants' objectivity. Significant between- 


10 Since the gross profit method is an inherently unreliable approach to determining ending inventory values (Chasteen
et al. 1989, 467), confidence in the inventory estimates was not expected to be high. Such a method, however, has
produced successful expert testimony for accounting professionals who served the client as a litigation support
specialist; see, for example, Electro Services, Inc. v. Exide Corporation [847 F.2d 1534 (11th Cir. 1988)] and Midland
Hotel Corporation v. H. Donnelley Corporation [118 Ill. 2d 318].

11 Only six of the 101 litigation specialists and nine of the 106 auditors provided confidence levels that were less than 50
percent.

12 To ensure adequate power for statistical tests concerning the direct and indirect effect of ethical reasoning on
accountants' inventory estimates, litigation specialists and auditors were grouped into high and low DIT P score 
subgroups. The median DIT P score from each sample of litigation specialists (DIT P=38.5) and auditors (DIT P=38) 
was used for splitting the two samples. The rationale for this procedure is explained by Ponemon and Gabhart (1990) 
and Arnold and Ponemon (1991). It is also important to note that using percentage stage scores from the DIT, it was 
revealed that subjects in the high DIT groups possessed primarily mape e Ed reasoning skills, while subjects 
in the low DIT groups possessed primarily stage 2 and stage 3 reasoning skills 

13 Given that the range of possible inventory estimates that could have been compiled from experimental data was 2.9 to 
14 million dollars, inventory values in all groups tended to be relatively conservative since most estimates fell below 
the median estimate of approximately 8 million dollars. 
TABLE 3 
The Means, Medians and Standard Deviations for Litigation Specialists' Inventory 
Estimates by Treatments, Experience Subgroups and DIT Levels 

                       Defendant       Plaintiff         Control
Low Experience
  Low DIT (n)                  6               9               7
    Mean              $5,653,654      $7,619,577      $6,585,429
    Median             5,381,377       7,626,145^b     6,652,135
    St. Dev.           1,928,517       1,734,617       2,017,968
  High DIT (n)                10              11              10
    Mean              $5,796,700      $7,623,133      $6,550,626
    Median             5,933,881^a     7,369,395       6,816,891
    St. Dev.           1,669,989       1,934,056       1,848,625
  Overall (n)                 16              20              17
    Mean              $5,743,057      $7,621,533      $6,564,956
    Median             5,774,761^a     7,138,643^b     6,652,135
    St. Dev.           1,682,005       1,907,503       1,913,755
High Experience
  Low DIT (n)                 11               9               8
    Mean              $5,983,483      $7,239,978      $6,575,900
    Median             5,657,629^a     7,369,395       6,652,891
    St. Dev.           1,935,704       1,870,045       1,971,576
  High DIT (n)                 7               6               7
    Mean              $6,598,164      $6,713,263      $6,687,022
    Median             6,652,135       6,662,391       6,520,641
    St. Dev.           2,115,581       1,813,645       1,779,122
  Overall (n)                 18              15              15
    Mean              $6,222,526      $7,029,292      $6,627,757
    Median             6,355,387       6,669,015       6,520,641
    St. Dev.           1,560,063       1,705,583       1,974,805
Overall Sample
  Low DIT (n)                 17              18              15
    Mean              $5,867,073      $7,429,778      $6,580,347
    Median             5,657,629^a     7,553,145^b     6,907,891
    St. Dev.           1,892,111       1,905,583       2,061,805
  High DIT (n)                17              17              17
    Mean              $6,126,715      $7,302,002      $6,606,789
    Median             6,355,387       6,907,891       6,520,641
    St. Dev.           1,892,785       1,803,861       1,913,870
  Overall (n)                 34              35              32
    Mean              $5,996,894      $7,367,715      $6,594,394
    Median             5,933,881^a     7,369,395^b     6,520,641
    St. Dev.           2,012,189       1,906,883       2,004,280

a The median difference between the defendant and control group is significant at or beyond the .05 level using the Mann- 
Whitney U test. 

b The median difference between the plaintiff and control group is significant at or beyond the .05 level using the Mann- 
Whitney U test. 


480 The Accounting Review, July 1995 


subject differences are found for low-experience individuals in the plaintiff and defendant 
treatments in comparison to those in the control group, regardless of DIT P score. In sharp 
contrast, between-subject differences are not significant for all individuals in the high experience 
subgroup. Table 3 reveals that the DIT P score levels explain marked differences between high 
and low experience subgroups, where managers and partners with relatively high DIT P scores 
in the plaintiff and defendant treatments tended to provide inventory estimates that are less biased 
(i.e., closer to the control group's estimate) than all other subgroups studied. 

Taken together, these findings partially support the first subsidiary hypothesis (H), which 
concerns the influence of ethical reasoning on bias in inventory estimation, in light of the finding 
that managers and partners with relatively high DITs were less influenced by the plaintiff or 
defendant manipulations than those with lower DITs. The second subsidiary hypothesis (H), 
however, is not supported because domain-specific experience moderates, rather than exacer- 
bates, the influence of the plaintiff and defendant treatment on bias in the estimation of JWD's 
inventory provided by litigation professionals. 

The means and medians reported in table 4 reveal a different story for auditors in two 
important respects. First, while the pattern of results is consistent with table 3 (i.e., plaintiff > 
control > defendant), both the defendant and plaintiff manipulations were much less significant 
for auditors than for litigation specialists. In addition, while the overall difference between the 
plaintiff and control group is significant based on the Mann-Whitney U test and Binomial 
procedure ($7,138,643 > $6,652,135, p < .01 and π > .88), the overall effect for the defendant 
manipulation is not significant ($6,466,605 < $6,652,135, p = .23 and π > .73). Second, and 
perhaps more importantly, the pattern of results shown in table 4 does not indicate any relationship 
whatsoever between the objectivity manipulations and experience, DIT P scores or the 
configuration of experience and DIT P scores. Hence, these findings do not support the first or second 
subsidiary hypotheses for the present sample of auditors. 

Two multivariate, fixed effect analysis of variance models were used to analyze the rather 
complex relationships among inventory estimates (dependent variable), experimental treatments, 
experience levels and DIT levels for litigation specialists and auditors. To control for unequal cell 
sizes in each model, the method of unweighted cell means (Neter et al. 1985, 753) was employed. 
Table 5 reports the ANOVA results separately for litigation specialists (panel A) and auditors 
(panel B).14 The results in panel A clearly show that the main effects for treatment (p < .001) and 
experience (p < .01) are statistically significant, while DIT is marginally significant (p < .10). The 
two-way interactions for treatment x experience (p < .01) and experience x DIT level (p < .001) 
are also significant. The only insignificant interactions in panel A concern treatment x DIT level 
and treatment x experience x DIT level. Although not surprising in light of the findings reported 
in table 4, the ANOVA results for auditors are much less significant. The full model in panel B 
shows that only treatment is significant (p < .01), while DIT is marginally significant (p < .10). 
All other main effects and interactions are insignificant. 
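
The idea behind unweighted cell means is that every treatment-by-experience-by-DIT cell counts equally when marginal means are formed, so larger cells do not dominate. The sketch below illustrates only that averaging step, using the cell sizes and cell means reported for litigation specialists in table 3. It is not the ANOVA computation itself, and the variable and function names are illustrative.

```python
# Cell sizes (n) and cell means for litigation specialists, copied from table 3.
# Cell order within each treatment:
# (low exp, low DIT), (low exp, high DIT), (high exp, low DIT), (high exp, high DIT)
CELLS = {
    "defendant": [(6, 5_653_654), (10, 5_796_700), (11, 5_983_483), (7, 6_598_164)],
    "plaintiff": [(9, 7_619_577), (11, 7_623_133), (9, 7_239_978), (6, 6_713_263)],
    "control": [(7, 6_585_429), (10, 6_550_626), (8, 6_575_900), (7, 6_687_022)],
}


def unweighted_mean(cells):
    """Simple average of the cell means; every cell counts equally."""
    return sum(mean for _, mean in cells) / len(cells)


def weighted_mean(cells):
    """Average of the cell means weighted by cell size (pools all subjects)."""
    total_n = sum(n for n, _ in cells)
    return sum(n * mean for n, mean in cells) / total_n


for treatment, cells in CELLS.items():
    print(
        f"{treatment:10s} unweighted={unweighted_mean(cells):,.0f} "
        f"weighted={weighted_mean(cells):,.0f}"
    )
```

With cell sizes ranging from 6 to 11, the two averages differ only modestly here, but the unweighted version keeps the marginal comparison from depending on how many subjects happened to fall in each cell.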


Debriefing 
After completing the experimental task, all litigation specialists and auditors completed a two-part 
debriefing questionnaire. The first part of the instrument required individuals to assign ranks 
to factors that were derived from open-ended interviews with litigation support specialists in firm 
one. The purpose of this task was to determine those attributes deemed most important to each 


14 The Kolmogorov-Smirnov test of normality and homoscedasticity revealed no significant violations in ANOVA 
residuals (Hollander and Wolfe 1973, 219). 
TABLE 4 
The Means, Medians and Standard Deviations for Auditors' Inventory Estimates 
by Treatments, Experience Subgroups and DIT Levels 

                       Defendant       Plaintiff         Control
Low Experience
  Low DIT (n)                  8               8               7
    Mean              $6,513,443      $7,619,577      $6,469,358
    Median             6,340,728       7,626,145^b     6,652,135
    St. Dev.           1,995,622       1,734,617       2,017,968
  High DIT (n)                11              11               9
    Mean              $6,412,622      $7,015,458      $6,501,314
    Median             6,177,694       7,138,643^b     6,355,387
    St. Dev.           1,917,188       1,771,106       1,837,230
  Overall (n)                 19              19              16
    Mean              $6,455,073      $7,269,823      $6,487,333
    Median             6,355,387       7,138,643^b     6,503,761
    St. Dev.           2,047,311       1,900,960       1,837,230
High Experience
  Low DIT (n)                 10               9              10
    Mean              $6,184,410      $6,993,681      $6,459,731
    Median             6,144,634^a     7,369,395^b     6,652,135
    St. Dev.           1,814,682       1,927,498       1,671,159
  High DIT (n)                 8               7               8
    Mean              $6,396,470      $7,007,065      $6,491,783
    Median             6,340,728       7,138,643^b     6,585,152
    St. Dev.           2,089,634       1,820,033       1,671,159
  Overall (n)                 18              16              18
    Mean              $6,278,659      $6,999,537      $6,473,976
    Median             6,340,728       7,138,643^b     6,503,761
    St. Dev.           1,921,189       1,814,046       1,735,260
Overall Sample
  Low DIT (n)                 18              17              17
    Mean              $6,330,647      $7,288,220      $6,463,695
    Median             6,466,605       7,626,145^b     6,652,135
    St. Dev.           1,905,172       1,850,565       1,708,005
  High DIT (n)                19              18              17
    Mean              $6,405,821      $7,012,194      $6,496,829
    Median             6,348,057       7,138,643^b     6,355,387
    St. Dev.           2,030,411       1,793,569       1,846,581
  Overall (n)                 37              35              34
    Mean              $6,369,250      $7,146,263      $6,480,262
    Median             6,466,605       7,138,643^b     6,652,135
    St. Dev.           1,984,250       1,867,506       1,886,246

a The median difference between the defendant and control group is significant at or beyond the .05 level using the Mann- 
Whitney U test. 

b The median difference between the plaintiff and control group is significant at or beyond the .05 level using the Mann- 
Whitney U test. 
TABLE 5 
Multivariate Analysis of Variance Models on Accounting Estimates by Treatments, 
Experience Subgroups and DIT Levels 

Panel A: Litigation Specialists 
                                             Degrees of    Significance
Effects                                      Freedom       Level
Treatment [defendant, plaintiff, control]    2             p < .001
Experience [high, low]^a                     1             p < .01
DIT level [high, low]^b                      1             p < .10
Treatment x Experience                       2             p < .01
Treatment x DIT level                        2             p is n.s.
Experience x DIT level                       2             p < .001
Treatment x Experience x DIT level           2             p is n.s.
F[12,101] = 4.223, p < .001, R² = .632 

Panel B: Auditors 
                                             Degrees of    Significance
Effects                                      Freedom       Level
Treatment [defendant, plaintiff, control]    2             p < .01
Experience [high, low]^a                     1             p is n.s.
DIT level [high, low]^b                      1             p < .10
Treatment x Experience                       2             p is n.s.
Treatment x DIT level                        2             p is n.s.
Experience x DIT level                       2             p is n.s.
Treatment x Experience x DIT level           2             p is n.s.
F[12,106] = 1.843, p < .064, R² = .257 

Notes: 

a Individuals were split into two experience subgroups based on their position level in the CPA firm. Staff and seniors 
represent low experience, and managers and partners represent high experience. 

b The median DIT P score was used to split sample into high and low groups. 

The method of unweighted means was used to adjust for unequal cell sizes in ANOVA computations (Neter et al. 1985, 
753). 


subject's underlying judgment on the experimental task. Fourteen attributes relating to the 
litigation support exercise were rank ordered by assigning “1” through “14,” from the most 
important to the least important factor. Differences in ranking by experimental treatments (not 
shown here) were small and relatively constant. Modest differences did exist, however, between 
high and low experience groups and between high and low DIT levels. 

Table 6 reports the median ranks for each of the 14 attributes for each sample. Six differences 
were found to be statistically significant at or beyond the .05 level using the Wilcoxon Signed 
Rank test (Hollander and Wolfe 1973, 33). Specifically, litigation specialists provided a higher 
rank to: The CPA should advocate the most advantageous position for the client, The CPA should 
TABLE 6 
Median Rank Ordering for 14 Debriefing Survey Attributes for Litigation 
Specialists and Auditors 

                                                       Litigation
Attributes influencing the estimation of JWD's         Specialists    Auditors
damages for lost inventory                             [n=101]        [n=106]
 1. JWD's controller prepared the financial            10             7          p < .05
    information used in computations.
 2. The gross profit percentage sharply declined       1              3.5
    in 1991.
 3. New competition by a Korean firm might have        12             12
    permanently reduced JWD's gross profit.
 4. The inventory valuation for April 1990 was         2              2
    higher than the yearly average.
 5. The CPA should advocate the most advantageous      3              9          p < .01
    position for the client.
 6. The CPA should maintain conservatism in all        9              14         p < .01
    accounting estimates.
 7. The CPA should maintain a level of                 5.5            1          p < .05
    objectivity in litigation support work.
 8. The CPA should strive to win the case for the      no             13         p < .01
    client organization.
 9. The CPA should avoid conflicts-of-interest.        4              3.5
10. The CPA doing litigation support work should       8              6
    be concerned about malpractice litigation.
11. JWD's trend in the gross profit percent            7              5
    changed direction during the year.
12. JWD is in a high risk industry.                    13.5           8          p < .01
13. JWD's fire was suspicious, perhaps caused by       13.5           11
    arson.
14. The insurance company will most likely settle      11             10
    rather than taking this case to court.

Note: 
Significance levels between litigation specialists and auditors on the median rank for individual items were determined using 
the Wilcoxon Signed Rank test (Hollander and Wolfe 1973, 33). 


maintain conservatism in all accounting estimates and The CPA should strive to win the case for 
the client organization. Auditors provided a significantly higher rank to: The CPA should 
maintain a level of objectivity in litigation support work, JWD is in a high risk industry and 
JWD's controller prepared the financial information used in computations. 

Auditors assigned the highest rank to the only survey item concerning objectivity, while 
litigation specialists provided much higher ratings than auditors to the two items pertaining to 
client advocacy. Even though litigation specialists at all experience and DIT levels provide a 
relatively high rank to objectivity on the debriefing questionnaire, between-sample differences 
corroborate experimental results by suggesting that accountants who provide litigation support 
services may place greater emphasis on client advocacy considerations than objectivity. 




Part two of the debriefing questionnaire captured individual perceptions about the experi- 
mental task and, more generally, about actual experiences in providing litigation support services 
for the firm's clients. Findings corroborate task realism for litigation specialists—83 accountants 
(82 percent) believed that the task was realistic, and six subjects (six percent) felt that it was 
contrived. In addition, 88 litigation specialists (87 percent) reported that the task was comparable 
to a task that they had actually completed during their tenure in the litigation support field. Of 
these, 68 individuals (67 percent) said they had used the gross profit method to assess a client's 
damages. Subjects were also asked about their ability to estimate or make inferences about 
damages in a lawsuit. Of the 48 participating managers and partners, 40 reported that they 
possessed above average skills, while the remaining eight reported that they possessed average 
ability in this area. Of the 53 senior or staff level accountants, 37 said that they possessed above 
average skills, 15 said they possessed average abilities and one person did not respond. 

Perceptions for auditors were similar to those of litigation specialists in most respects. While 85 
auditors (80 percent) believed that the task was realistic, only 47 auditors (44 percent) felt that the 
task was comparable to something that they had performed during their auditing career. Of these 
individuals, 41 auditors (39 percent) said they had used the gross profit method to assess or 
validate a client's inventory value. Auditors also reported average or above average abilities in 
estimating missing inventory using the gross profit method. Of the 52 participating managers and 
partners, 43 reported that they possessed above average skills, while the remaining nine reported 
that they possessed average ability in this area. Of the 54 senior or staff level auditors, 35 said that 
they possessed above average skills, 15 said they possessed average abilities and four felt they 
did not have the requisite skill. 


V. DISCUSSION AND CONCLUSION 


These results provide evidence that the judgments of professional accountants who practice 
litigation support services are sensitive to the legal position of the client or attorney who employs 
them. The findings for auditors suggest that neither experience nor ethical reasoning level seems 
to influence the objectivity of their inventory estimates. One reason for between-sample 
differences is that experience may not be a good surrogate for expertise (see Davis and Solomon 
1989), and auditors were unable to relate their experience to an experiment based on a litigation 
support situation. 

Perhaps the interaction between domain-specific experience and DIT P scores for litigation 
specialists indicates that the moral expertise paradigm as advanced by Gaa (1993) and Gaa and 
Ponemon (1994) explains the accountants' sensitivity to objectivity in the present experiment. 
Based on this theory, one would predict that "superior" ethical judgments in a professional context 
(such as the litigation support professional’s ability to uphold objectivity in the face of economic 
pressures) require the practitioner to possess a sufficient level of relevant experience and ethical 
reasoning skills before he or she can adequately frame and resolve ethical conflict. 

The overall findings of this research are consistent with earlier studies on the existence of bias 
in the testimony provided by expert witnesses. Like Otto’s (1989) study of clinical psychology 
students, the present research shows that accountants favor their client's economic interests. 
Unlike Otto (1989), who looked only at novices (doctoral students), the present work captured 
the judgments of highly experienced individuals who work in the litigation support field and 
compared the judgments of these individuals to a related group of professionals who do not 
practice litigation services. 

The implications of this research are twofold. First, given that the litigation specialist’s role 
requires adherence to Rule 102 of the Code (AICPA 1989), the bias in accounting estimates 
provided by litigation specialists in this experiment may indicate a tendency for some individuals 
to subordinate their judgments. This issue is important because services that conflict, either in fact 
or in appearance, with the CPA’s objectivity in one area of practice could diminish the public’s 
positive perceptions of quality in all facets of accounting. In addition, as aptly noted by Crain et 
al. (1994), litigation support services are not immune to the rash of malpractice claims which also 
threatens the public accounting profession.15 

The second implication of this research is that experience coupled with ethical reasoning may 
enhance the objectivity of litigation support judgment, thus supporting Gaa's (1993) moral 
expertise paradigm. This finding is also consistent with Zonana's (1984) work in that experienced 
accountants may think that too much allegiance to one side or the other in any given lawsuit may 
reduce the credibility of expert testimony in the eyes of the judge and jury. In support of this claim, 
the results for both litigation specialists and auditors were extremely conservative because their 
accounting estimates were almost always below the median estimate that could have been 
computed from the experimental materials provided. Along these lines, one partner in the 
litigation support field noted in a closing interview that, "Experience breeds prudence and 
prudence demands a relatively conservative, but winnable, position when rendering [expert] 
testimony." 

The following limitations should be considered carefully before seeking to generalize the 
present findings to accountants' objectivity in the practice of litigation support. That is, a sample 
of litigation support and auditing personnel from two large public accounting firms limits the 
ability to extend results to the population of all accountants and auditors in the practice of litigation 
support in both large and small firms. Notwithstanding the care taken to ensure external validity 
within the experiment, it is important to note that the actual judgments of litigation specialists 
were never observed. 

Despite these limitations, the results of this research provide support for the primary 
hypothesis advanced earlier. More importantly, it suggests that moral expertise may play an 
important role in the way litigation support specialists in public accounting firms frame and attend 
to client advocacy considerations. Although experimental findings concerning the configuration 
of domain-specific experience and ethical reasoning levels on professional judgment are 
tentative, results of this study indicate three researchable issues that deserve further consideration. 

While the expertise literature in psychology suggests that members of a professional group 
such as litigation specialists and auditors in public accounting firms can learn how to make 
"superior" judgments, the findings of this research imply that their expertise is incomplete 
without the ethical reasoning skills to comprehend ethical conflict inherent in the professional 
role. Since the study of expertise has tended to focus almost exclusively on technical judgment 
(see Ericsson and Smith 1991 for an overview of this area of research), results of this study 
demonstrate that a task characteristic relating to an ethical judgment (such as objectivity) can be 
studied in a rigorous and scientific fashion. 

Another related issue concerns the findings of recent ethical reasoning studies in accounting 
and auditing (Ponemon and Gabhart 1993, 1994), which show that CPAs are not reaching their 
capacity to make post-conventional or principled ethical judgment. Implicit in this work is the 
notion that higher ethical reasoning levels should result in heightened ethical sensitivity and 


15 In spite of significant between-subject differences found in this paper, it is important to emphasize the fact that litigation 
specialists and auditors provided relatively conservative accounting estimates in this study (see footnote 13). Thus, it 
is possible that the inherent tendency toward client advocacy as found herein would be of little consequence to the public 
as long as estimates fall within a reasonable range. 
improved problem solving abilities for individuals in their professional practice. Results of this 
study provide a more complex story, showing that a sufficient level of domain-specific experience 
may be a requisite to ethical reasoning in a professional domain such as litigation support. 

While the present study deals with litigation support judgments, the issue of professional 
objectivity is important in other aspects of public accounting practice. For instance, CPAs who 
prepare income taxes for clients are required to uphold professional objectivity, but at times their 
role demands that they advocate an aggressive tax position for their clients. As observed by 
Johnson (1993) and Cuccia et al. (1995), the issue of client advocacy may create ethical conflict 
for CPAs, especially when the tax issue is based on vague standards or requires discretion and 
latitude in judgment. The above issues and areas of study should prove to be very useful in 
determining whether objectivity in the litigation support field is situation specific or is predicated 
upon the ethical orientation of the accounting firm or the moral expertise of the individual 
practitioner. 
I. INTRODUCTION 


Prior research seeking to explain cross-sectional variation in accounting met 
uncovered certain empirical regularities: firms' choices appear to be assoc 

leverage, size, and the existence of bonus plans (Holthausen and Leftwich 1' 
Zimmerman 1986, 1990; Christie 1990). These results are generally interprete: 
evidence of managers' opportunistic behavior because the accounting choices are« 
reducing the probability of debt covenant violations and increasing managerial | 
However, Watts and Zimmerman (1990) question this interpretation on the gr 
documented associations might in fact reflect the correlation between firms' inve 
tunity sets, financial policies, and their efficient set of accounting methods. Emp 
research has shifted to providing evidence on these relationships (Smith and Wat 
and Gaver 1993; Skinner 1993), although distinguishing between opportunism ar 
difficult both theoretically and empirically (Smith 1993). 

This study extends accounting method choice research to the issue of 1 
allocation. Specifically, this study examines whether firms' choice between com] 
partial income tax allocation is related to the traditional explanations of manageri: 
and political visibility after controlling for their economic expectations and invest 
nity sets. Juxtaposing firms' economic expectations and investment opportunit 
contracting and political cost variables permits one to simultaneously addres: 
efficiency and opportunism as they relate to accounting method choice. In addit 
examines the role of external auditors in firms’ accounting method choice. ] 
association between auditors’ stated positions and their client firms’ choices intr 
stakeholder likely to be involved in that decision in addition to the parties generz 
in the prior literature (e.g., bondholders, stockholders, and managers). 

The choice between comprehensive and partial allocation is examined 
Domestic International Sales Corporations (DISCs), a special type of corporatic 
1971 to stimulate exports. DISCs provide a unique database to examine this choic 
were typically organized as subsidiaries of U.S. corporations and were statutor 
indefinite deferral of U.S. income tax on a portion of their export profits. These 
qualified DISCs to use the "indefinite reversal criterion" exception to the general: 
comprehensive allocation of income taxes under generally accepted account 
(GAAP).1 Based on that exception, some firms in fact did not accrue taxes for fina 
purposes on the deferred DISC earnings (partial allocators), whereas others did 
(comprehensive allocators). 

The test results, based on a sample of 320 firms with a DISC operational in 19 
support for the managerial opportunism perspective but only partial support for 
based perspective on accounting method choice. Specifically, firms with higher | 
interest coverage ratios, and low profitability were found to be more likely to choc 
increasing partial allocation method. In addition, firms with higher effective 
observed to be less likely to use partial allocation. With regard to the efficiency-ba: 


1 Under Accounting Principles Board Opinion No. 11 (APB 11), the authoritative pronouncement 
income taxes applicable at that time, comprehensive allocation was required for all temporary differer 
special areas. APB 23 introduced the "indefinite reversal" criterion to allow exceptions to this requi: 
the five excluded areas. One of these areas was the undistributed earnings of subsidiaries. The new s 
taxes (SFAS 109) supersedes APB 11 but retains the comprehensive allocation requirement as w 
exceptions. 
firms with weaker cash flows were found to be more likely to use partial allocation, but firms’ 
choices were not observed to be associated with their investment opportunity sets. Together, this 
evidence is consistent with managers acting opportunistically to avoid potential debt covenant 
violations, political scrutiny, or to mask poor performance, but only partially supportive of 
managers choosing methods for efficiency reasons and to reflect their economic expectations. 
Finally, there is strong support for an association between external auditors’ stated position on the 
accounting treatment of tax allocation choice and their client firms’ method choice. 

The rest of this paper is organized as follows: section II provides the institutional background 
of DISCs; section III develops the hypotheses; section IV describes the data and empirical 
procedures used to test those hypotheses; section V presents the results; and section VI offers 
some concluding remarks. 


II. INSTITUTIONAL BACKGROUND OF DOMESTIC INTERNATIONAL 
SALES CORPORATIONS (DISCS) 


For tax years beginning on or after January 1, 1972, U.S. taxpayers were allowed to establish 
a new type of domestic corporation called DISCs that were entitled to certain special tax benefits.2 
In contrast with regular U.S. corporations that are subject to double taxation, DISCs were 
considered nontaxable entities and their profits subjected to tax only at the shareholder-level 
when distributed or deemed distributed. Each year, a portion of the DISC’s export profits was 
deemed distributed to its shareholders, thereby attracting current U.S. tax on that income. 
However, U.S. tax could generally be deferred indefinitely on the remaining export profits. The 
extent of the deferral varied over time, ranging from 50 percent of the DISC export income to 
42.5 percent of the DISC's incremental export income over a moving average base. These benefits were 
lost (and prior years' benefits required to be recaptured) if: (1) the indefinitely deferred DISC 
income was actually distributed; or (2) the corporation ceased to qualify as a DISC, its DISC 
election was terminated or revoked, or it was liquidated. The key requirement to receive DISC 
status was that substantially all of the corporation's gross receipts and gross assets be export- 
related (see sections 991 to 997 of the Internal Revenue Code for the primary statutory provisions 
affecting DISCs). 

For financial reporting purposes, some firms accrued taxes on the indefinitely deferred DISC 
earnings (comprehensive allocators) despite the deferral granted by statute, whereas others did 
not (partial allocators). Relative to comprehensive allocators, partial allocators recognized a 
lower income tax expense, higher net income after taxes, higher retained earnings, and a lower 
deferred tax credit balance. 
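
The direction of these differences can be seen in a small worked example. The sketch below is illustrative only: the function name, the income figures, and the 46 percent tax rate are hypothetical and are not drawn from the sample firms.

```python
def tax_allocation(pretax_income, deferred_disc_earnings, tax_rate, comprehensive):
    """Return book tax expense, net income and deferred tax credit for one year.

    Under comprehensive allocation, tax is accrued on the indefinitely deferred
    DISC earnings; under partial allocation it is not.
    """
    # Tax currently payable excludes the deferred DISC earnings in both cases.
    current_tax = tax_rate * (pretax_income - deferred_disc_earnings)
    deferred_tax = tax_rate * deferred_disc_earnings if comprehensive else 0.0
    tax_expense = current_tax + deferred_tax
    return {
        "tax_expense": tax_expense,
        "net_income": pretax_income - tax_expense,
        "deferred_tax_credit": deferred_tax,
    }


# Hypothetical firm: $1,000,000 pre-tax income, $200,000 of it deferred DISC earnings.
comprehensive = tax_allocation(1_000_000, 200_000, 0.46, comprehensive=True)
partial = tax_allocation(1_000_000, 200_000, 0.46, comprehensive=False)
print(comprehensive)
print(partial)
```

In this example the partial allocator reports tax expense of about $368,000 rather than $460,000, so its net income is higher by the $92,000 that the comprehensive allocator carries as a deferred tax credit.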


2 The rationale for providing the special tax benefits was not only to stimulate U.S. exports, but also to remove existing 
sources of discrimination against U.S. exporters. For example, while U.S. corporations engaged in export activities were 
subject to U.S. income tax currently on their foreign earnings, U.S. corporations producing and selling abroad could 
defer U.S. income taxes on their foreign earnings as long as those earnings were retained abroad. In addition, while most 
trading partners of the U.S. subsidized their exporters by refunding any value added taxes paid by them and by exempting 
their foreign earnings from domestic income taxes, the U.S. did not provide exporters any direct subsidies (U.S. 
Congress 1971a, 1971b). 
Hypotheses based on Contracting Cost Arguments 


The Debt-Covenant Hypothesis 


It has been argued for some time that conflict exists between bondholders and s 
because stockholders acting opportunistically can transfer wealth from bondholde 
selves via dividend payments, claim dilution, asset substitution, or under-investment 
Warner 1979). Agency theory suggests that debt covenants written in lending agre 
reduce this dysfunctional behavior and increase overall firm value. Because thes: 
usually are specified in terms of accounting numbers (Smith and Warner 1979), fim 
of covenant violation are hypothesized to use income-increasing accounting metho: 
those constraints. These arguments suggest that, ceteris paribus, DISC firms closer 
debt-covenant violation are more likely to use partial allocation.3,4 


The Public Debt Hypothesis 


Apart from the agency costs of debt, contracting costs also include renegotiation 
costs (Watts and Zimmerman 1990). It is believed that these costs are higher with pub! 
private debt because public debt is more widely held (Smith and Warner 1979). Hence 
relatively more public debt are more likely to be concerned about covenant violatio 
necessitate costly renegotiation, and thus more likely to adopt income-increasing 
procedures (Holthausen 1981; Leftwich 1981). This reasoning would predict t 
paribus, DISC firms with more public debt in their capital structure are more likely tı 
allocation. 

There are offsetting influences, however, that make it difficult to find an efl 
hypothesis. For example, public debt covenants have been observed to be less restricti 
because they are more costly to renegotiate), and the vast majority of covenant violat 
be of private debt agreements (Beneish and Press 1993; Chen and Wei 1993; Swee 


The Political Cost Hypothesis 


The assumption of positive information and lobbying costs made in the industri. 
tion literature suggests that the political process generates costs for firms (Watts and 2 
1986, 224—242; 1990). Because accounting numbers are used in this process, there ar 
to manage these numbers, especially firms’ reported profits. Following Watts and 2 
(1978), firm size typically is used to proxy for political costs because larger firms als 
more successful and more visible. Hence, larger firms are hypothesized to adopt incon 
accounting procedures. However, this argument has conceptual problems because 
known to be associated with many different phenomena (Ball and Foster 1982). : 


3 This hypothesis is based on several assumptions, such as breaches of debt covenants are costly and that 
as leverage adequately proxy for firms' closeness to covenant constraints (Begley 1990). However, Bei 
(1993) and Sweeney (1994) provide empirical support for the first assumption, whereas Press and Weint 
Duke and Hunt (1990) find evidence consistent with the second assumption. 

4 Most prior accounting method choice studies have also tested a management compensation hypoth 
dummy variable for the existence of accounting earnings-based bonus plans. However, Healy (198 
depending on plan details, certain accounting decisions would not affect managers’ bonus awards. Fc 
choice between comprehensive and partial allocation would not affect managers’ bonus awards if suc 
based on earnings before taxes. Hence, bonus plan disclosures in the financial statements and 10K of th 
in this study were examined. Of the 35 firms mentioning the definition of earnings used to compute bo 
a before-tax basis and only one an after-tax basis. Hence, the bonus hypothesis was not pursued furth: 
empirical evidence holds only for the largest firms (Zmijewski and Hagerman 1981) and appears
driven by firms in the oil and gas industry (Zimmerman 1983). 

Because taxes are one element of the total political costs borne by firms, this study uses firms’ 
effective tax rate (ETR) to surrogate for political costs. With the exception of El-Gazzar et al. 
(1986), ETR remains largely unexplored as a proxy for political costs in studies of accounting 
choice in the U.S. Assuming taxes are not systematically offset by the non-tax components of 
political costs (e.g., regulation, quotas, and tariffs), firms subject to high political costs would be 
more conservative in their tax and accounting choices resulting in higher ETRs and adoption of 
income-decreasing accounting methods.5 In terms of this study, it can be hypothesized that, ceteris
paribus, DISC firms with high ETRs are less likely to choose partial allocation. However, 
Zimmerman’s (1983) evidence suggests that the ability of ETRs to proxy for political costs may 
vary across time and industries. Hence, ETRs will be supplemented with other political cost 
variables used in prior research, such as concentration ratios. 


Hypotheses Based on Efficiency Arguments 
DISC Distribution Likelihood Hypothesis 


Apart from the contracting and political cost arguments that focus on managerial opportun- 
ism, another explanation for accounting choice is provided by the efficient (or optimal) 
contracting perspective (Holthausen 1990; Watts and Zimmerman 1990). Under this perspective, 
managers are viewed as choosing accounting methods from an accepted set for efficiency reasons,
i.e., to minimize contracting costs and maximize firm value. In the context of this study, DISC 
firms’ accepted set of accounting methods can be viewed as a function of the likelihood of reversal 
of the tax benefits in the future. If firms expected reversal, they had to accrue taxes on the timing 
difference, i.e., their accepted set was constrained to comprehensive allocation. If they did not 
expect reversal, they could choose either comprehensive or partial allocation. 

Controlling for variations in firms’ accepted sets is difficult empirically and, hence, evidence 
on how this affects accounting choice is limited (Watts and Zimmerman 1990). However, Watts 
and Zimmerman (1990) also contend that failure to control for the cross-sectional differences in 
firms’ accepted sets induces a correlated omitted variables problem because the accepted set of 
accounting methods is one part of the firms’ implicit and explicit contracts. In this study, firms’ 
likelihood of DISC reversal might depend on their financial position, specifically cash flows in 
the period following establishment of the DISC, which in turn is likely to be correlated with the 
contracting and political cost variables. However, at least two opposing responses are possible. 
On the one hand, firms with greater expected cash needs might expect to withdraw DISC profits 
and, therefore, would be more likely to choose comprehensive allocation. On the other hand, 
assuming firms’ cash needs can be met by other means, firms with poor cash flows might have 
a stronger motivation to maintain DISC status and thereby avoid triggering the reversal of tax benefits
and the need to pay those taxes currently. In that case, such firms would choose partial allocation. 
In any event, it can be hypothesized that DISC firms’ accounting method choice is related to some
measure of the DISC distribution likelihood.6


5 Alternatively, higher ETRs of larger firms, which are more established and diversified, could be due to the high costs 
of adjusting to an efficient tax equilibrium, or simply from investing in tax-disfavored activities (Scholes and Wolfson 
1992). 

6 I thank an anonymous reviewer for suggesting this hypothesis. It should be noted, however, that it was possible for parent
firms to get access to the DISC's tax deferred funds through “producer loans.” Proceeds from producer loans could only 
be used by borrowers for investments in plant and equipment, research and experimentation expenditures, and purchases 
of inventory, all of which supported the export business. The hypothesis in the paper implicitly assumes that these 
restrictions effectively limited parent firms’ ability to access DISC funds without paying the DISC deferred taxes.
The Investment Opportunity Set Hypothesis


The efficiency-based perspective (Watts and Zimmerman 1986, 1990) also suggests that 
firms’ accounting method choices are related to their investment opportunity sets (ios). More 
generally, Smith and Watts (1992) contend that major corporate policies (including accounting 
choices) are related to the ios and to each other. Thus, failure to include the ios in an accounting 
choice study also results in an omitted variables problem. Although evidence consistent with the 
ios impacting firms’ financing, dividend, and compensation policies has been found in recent 
studies (Smith and Watts 1992; Gaver and Gaver 1993), evidence on the effect of ios on 
accounting method choice using U.S. data appears limited to Skinner (1993), who finds support 
for this effect in some accounting choices but not others. 

Conceptually, the ios-corporate policies relation is based on Myers’ (1977) notion of a firm 
as a combination of assets in place and future investment (growth) options. Specifically, Myers 
argues that firms with more assets in place (i.e., fewer growth opportunities) are expected to have 
higher leverage because they are less subject to the underinvestment and asset-substitution 
problems associated with risky debt.7 Following the debt covenant arguments advanced earlier,
Skinner (1993) extends the ios-leverage linkage to suggest that firms with more assets in place 
are more likely to choose income-increasing accounting methods. In addition to this indirect 
effect, ios are believed to have a direct effect on accounting procedure choice via firms' accepted 
set of accounting methods (Watts and Zimmerman 1990). Specifically, because growth firms' 
assets are more difficult to observe, managers have greater flexibility to act opportunistically ex 
post. Hence, the accepted sets of accounting procedures for such firms are likely to restrict 
managers' ability to choose income-increasing accounting procedures ex ante. However, restrict- 
ing the accounting choice set for growth firms may also be more costly, making it difficult to 
specifically predict the direction of the ios-accounting choice link. In any case, based on the above 
discussion it can be argued that DISC firms' accounting method choice likely is related to their 
investment opportunity sets. 


The Auditor Hypothesis 


Typically, the accounting choice research has focused on the economic consequences to 
managers, stockholders, and bondholders. However, other stakeholders (e.g., external auditors) 
also are involved in firms' financial reporting decisions and their preferences may affect the 
choice of accounting methods. The role of external auditors is of particular interest in this study 
because evidence suggests that certain auditing firms had strong opinions on the interperiod tax 
allocation issue. Comment letters to the APB indicate that opinions among the then Big 8 CPA 
firms were polarized between Price Waterhouse & Co. (PW) supporting partial allocation and 
Arthur Andersen & Co. (AA) favoring comprehensive allocation. PW's opinion was based on 


7 The underinvestment problem is associated with managers forgoing positive net present value projects because most
of the payoffs go to the bondholders. The asset-substitution problem arises when managers substitute higher variance 
for lower variance assets once debt has been issued. Assuming the debt has been issued and priced on the basis of low 
variance assets, the substitution causes wealth transfers from bondholders to stockholders. In both instances, managers 
are assumed to act in the best interests of stockholders. 

8 Unlike written comments to the FASB, comment letters on exposure drafts of APB Opinions are not systematically
housed anywhere. The APB 11 comment letters were traced to the Chicago office of Arthur Andersen, whose 
cooperation is gratefully acknowledged. Whereas none of the other Big 8 firms expressed their opinions as clearly or 
publicly as AA and PW, their letters suggest that Ernst & Ernst, Peat Marwick, Touche Ross, and Coopers & Lybrand 
supported AA’s position on comprehensive allocation, while Haskins & Sells supported PW's position. No evidence 
was found on Arthur Young's position. Although this information was used in the empirical analysis for sensitivity 
purposes, problems in inferring causality of the auditor effect (discussed later) are greatly reduced when the positions 
of only AA and PW are considered. 
a study of its 100 large client-corporations that was widely quoted in the financial press and in
several comment letters to the APB, whereas AA’s well-known support for comprehensive 
allocation was expressed in journal articles and letters to the APB.9 Finally, PW's and AA’s
position on the extent of interperiod tax allocation was known as early as 1967, whereas DISCs 
became available for the first time in 1971. Based on the above reasoning, it can be hypothesized 
that, ceteris paribus, a DISC firm's choice between comprehensive and partial allocation is 
related to its external auditor's preferences. 

In contrast to prior studies (Thornton 1986; Trombley 1989), the well-documented position 
of some auditing firms on tax deferral and the timing of the DISC accounting method choice 
examined in this study provide circumstantial evidence suggesting the direction of the auditor 
influence. However, causality is difficult to infer because auditors' stated position may have been 
initially influenced by their clients at that time. The subsequent public disclosure of this position 
may simply reflect auditors’ commitment to the accounting treatment advocated earlier based on 
their clients' preferences. 


IV. DATA AND EMPIRICAL PROCEDURES
Sample Selection and Classification of DISC Accounting Method 


Firms with DISCs were identified from the 1972, 1973, and 1974 annual report files of the 
NAARS database. The 1972 file was the appropriate starting point because DISCs first became
available for tax years beginning January 1, 1972. The search was limited to three years so that
the study period was contained in one tax regime.10 A total of 491 companies were identified as
having a DISC in at least one of the three years searched. From this initial sample, 142 firms not 
listed on the annual Compustat (Industrial or Research) tapes and 29 firms with inadequate 
disclosures about the DISC accounting method being used were deleted. These exclusions 
resulted in a final sample of 320 firms, of which 238 (74.4 percent) were classified as partial 
allocators and 82 (25.6 percent) as comprehensive allocators. Panel A of table 1 summarizes the 
sample selection procedures. 

Classification of the sample firms' DISC accounting method choice was primarily based on 
tax footnote disclosures governed by ASR 149. Under these requirements, the expected 
disclosures were as follows: 


                             Nature of
Type of Firm                 Book-Tax Difference     Type of Disclosure

PARTIAL ALLOCATOR            PERMANENT DIFFERENCE    ADJUSTMENT IN ETR RECONCILIATION
COMPREHENSIVE ALLOCATOR      TIMING DIFFERENCE       SOURCE OF DEFERRED TAX


9 PW's study entitled "Is Generally Accepted Accounting for Income Taxes Possibly Misleading Investors?" was
published in The Wall Street Journal (July 21, 1967) and the Financial Executive (September 1967, 70—75). AA’s 
intense involvement was gleaned from a selected reading of several thousand pages of letters and inter-office 
memoranda between AA personnel from which it appears that Mr. George Catlett, then partner of AA, was extremely 
influential in the debate over accounting for income taxes. 

10 The Tax Reduction Act of 1975 eliminated DISC benefits from the export of depletable energy products and certain
minerals. More importantly, the Tax Reform Act of 1976 allowed the DISC benefits only for incremental export income 
over a moving base period. 
TABLE 1
Sample Selection and Profile Analysis


Panel A: Sample Selection

Firms with Domestic International Sales Corporations (DISCs) identified from the
  1972, 1973, and 1974 files of NAARS                                             491
Less: Firms not on the Annual Compustat (Industrial or Research) files            142
Less: Firms for which DISC accounting method not determinable                      29
Final Sample                                                                      320

Classification of sample firms by DISC accounting methodᵃ
  Partial Allocators                                                      238 (74.4%)
  Comprehensive Allocators                                                 82 (25.6%)


Panel B: Profile Analysis by Sample Firms’ DISC Accounting Method Choice

                                           Partial      Comprehensive            Percent     χ²
                                           Allocators   Allocators      Total    of Total    p(χ²)

Stock Exchange Listing
  AMEX                                         54            15           69       21.6
  NYSE                                        181            65          246       76.9       1.17
  OTC                                           3             2            5        1.5      (0.56)

Industry Membership (one-digit SIC code)
  Non-durable goods manufacturers (2)          39            24           63       19.7
  Durable goods manufacturers (3)             168            43          211       65.9       9.36
  Other (1,4,5,6,7,8)                          31            15           46       14.4      (0.01)

Year When DISC Operational (“DISC-Year”)
  1972                                        124            20          144       45.0
  1973                                         86            40          126       39.4      21.74
  1974                                         28            22           50       15.6      (0.00)


Panel C: Selected Financial Attributes (in millions of $) by Sample Firms’ DISC Accounting Method Choiceᵇ

                        Partial Allocators                 Comprehensive Allocators
Financial Variable      Mean       Std. Dev.   Median      Mean       Std. Dev.   Median

Total Assets            346.342     782.952    74.967      219.713     374.862    87.464
Net Sales               416.384    1018.409   106.846      286.365     419.076   140.056
Long-Term Debt           70.859     155.504    12.330       50.557     130.101    11.245
Deferred Taxes            6.152      20.215     0.470        5.854      16.811     0.587
Pretax Income            28.865      87.280     6.323       23.441      38.600     9.564


ᵃ The classification is based on the extent to which sample firms accrued taxes on their indefinitely deferred DISC
earnings (see text for details).
ᵇ Based on data for the year prior to the sample firms’ DISC-year.
Actual disclosures by the sample firms varied considerably, however, ranging from short
statements in the footnotes to detailed dollar effects of the DISC accounting choice. 


Profile Analysis 


To describe their composition, panel B of table 1 provides a classification of the sample firms 
by: (1) their stock exchange listing, (2) their industry membership (based on one-digit SIC 
codes),12 and (3) the year when their DISC was first operational ("DISC-year"). This
classification also allows one to determine whether the sample selection procedure resulted in different
subsets of firms. As the table shows, over three-fourths of the sample firms are listed on the New 
York Stock Exchange, about 86 percent are in manufacturing, and nearly half had a DISC 
operational in 1972. 
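
The chi-square statistics in panel B can be checked directly from the reported counts. The sketch below assumes that the reported statistic is an uncorrected Pearson chi-square on each 3 x 2 classification (the paper does not state the exact test variant); the function and variable names are illustrative. With two degrees of freedom, the upper-tail probability has the closed form exp(-x/2), so no statistical library is needed.

```python
import math

# Counts from panel B of table 1: (partial allocators, comprehensive allocators).
PANEL_B = {
    "exchange": [(54, 15), (181, 65), (3, 2)],      # AMEX, NYSE, OTC
    "industry": [(39, 24), (168, 43), (31, 15)],    # non-durable, durable, other
    "disc_year": [(124, 20), (86, 40), (28, 22)],   # 1972, 1973, 1974
}


def chi_square(rows):
    """Pearson chi-square statistic for an r x 2 contingency table."""
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    n = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


for name, rows in PANEL_B.items():
    stat = chi_square(rows)
    print(f"{name}: chi2 = {stat:.2f}, p = {p_value_df2(stat):.2f}")
```

Under these assumptions the computed statistics agree with the tabulated values (1.17, 9.36, and 21.74), which suggests the counts in panel B are internally consistent.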

The preponderance of DISCs in manufacturing is to be expected given the nature of the DISC 
legislation. However, it is not obvious why durable goods manufacturers chose partial over
comprehensive allocation much more frequently (about 4:1) than other industry groups (less than 
2:1). One possible explanation is that industry membership is correlated with firm size (Ball and 
Foster 1982). Based on correlations reported later, durable goods manufacturers indeed are 
relatively smaller and in less concentrated industries, which is consistent with the political cost 
argument that larger firms tend to adopt income-decreasing accounting procedures. An industry 
association is also a source of concern in any accounting choice study because the set of accepted 
methods is likely to vary by industry (Watts and Zimmerman 1990). Moreover, an industry 
association is important in this study because auditors, especially the larger firms, are known to 
specialize in certain industries to distinguish themselves from their competition. All of these 
factors motivate including a control for industry membership to avoid misleading inferences. 

Panel B of table 1 also suggests that not all of the sample firms set up DISCs in 1972,
the year they first became available. Sample firms were considered to have DISCs operational in
the year when first mentioned in NAARS (their DISC-year). The decline in the proportion of firms
choosing partial allocation from 86 percent in 1972 to 56 percent in 1974 raises the possibility that 
firms setting up DISCs in later years may have perceived a shorter shelf-life for the DISC deferral. 
Thus, the year of DISC adoption might reflect firms’ expectations regarding DISC reversal. This 
suggests the inclusion of year dummies, which would also control for other macro-economic 
influences. 

Panel C of table 1 presents distributions of selected financial attributes of the sample firms. 
Based on median values, partial allocators appear to be smaller in size (assets or sales) and have 
less debt and deferred taxes, relative to comprehensive allocators. However, the opposite is true 
using mean values of these variables, suggesting the presence of skewness and motivating 
appropriate transformations, especially for firm size. 


11 Several sample firms presented the DISC tax effect both as a timing difference (implying comprehensive allocation) 
and as a permanent difference (implying partial allocation). See Gupta (1990) for examples of these disclosures. These 
disclosures are consistent with: (1) the DISC having a fiscal year different from its parent corporation and the parent
being a partial allocator; (2) deferred taxes being accrued on only a portion of the indefinitely deferred DISC earnings; 
and (3) firms having multiple DISCs with different accounting policies for different DISCs. The first explanation is 
believed most likely because Treasury studies found that most DISCs' accounting period lagged slightly behind their 
parents’ accounting period (Hartzok 1983). This postponed by as much as one year the inclusion of DISC income in 
the parent's tax return and hence payment of taxes on that income. The second explanation is considered unlikely based
on firm disclosures and Marocco (1985). While the third explanation is plausible, none of the sample firms disclosing 
multiple DISCs mentioned using different accounting policies. Hence, these firms were classified as partial allocators. 

12 Analysis by a finer classification scheme was not meaningful because the sample firms fell in 153 different 4-digit and
37 different 2-digit SIC classes. 
Empirical Model and Variable Definitions


Based on the hypotheses developed earlier, firms’ interperiod tax
allocation choice with respect to the DISC deferral is modeled as a function of their debt
covenants, public debt, efficiency-based variables, political costs, auditor position, and various
control variables. To test the descriptive validity of these hypotheses, empirical proxies for these 
constructs are required. The specific variables selected for this purpose are presented below, 
along with their definition and the manner in which certain measurement issues affecting them 
were resolved. 

Data to measure the variables were obtained from Compustat, and supplemented from annual 
reports, 10Ks, and Moody's Industrial manuals. Because the NAARS disclosures mentioned 
above indicate that sample firms may have different DISC-years, the explanatory variables were 
measured for the year prior to the DISC-year. Another approach was to assume that all firms had 
DISCs operational in 1972, the year when they first became available, and use 1971 data for all 
firms. Measuring the variables prior to when the accounting choice was available is desirable as 
it helps mitigate endogeneity problems inherent in accounting choice studies. This is especially 
important in this study for variables such as effective tax rates, which otherwise would obviously 
be affected by the tax allocation choice. Both approaches produced qualitatively similar results, 
so results presented in the paper are based on the first approach. 


Debt covenant variables 


Of the various types of bond covenants, financing-related constraints and/or restrictions on 
payment of dividends are most frequently observed (Shevlin 1987; Duke and Hunt 1990; Press 
and Weintrop 1990). Hence, three variables, leverage (LEV), interest coverage (INTCOV), and 
dividend coverage (DCOV), were included to test the debt covenant hypothesis. 

LEV, defined as long-term debt divided by common equity (both based on book values), was 
used as a surrogate for proximity to debt covenant restrictions and/or the probability of default 
on debt agreements, which is supported by evidence in Duke and Hunt (1990) and Press and 
Weintrop (1990). Although many different measures of leverage have been used in prior 
accounting method choice studies, the descriptive statistics and correlations (not reported here) 
indicated that for this sample these other measures were essentially similar and not affected by 
whether capitalized lease obligations, preferred stock, and/or deferred taxes were considered debt 
(Gupta 1990). 

INTCOV, defined as the sum of income before extraordinary items and interest expense
divided by interest expense, provides a measure of the extent to which interest charges payable
to creditors are covered by current earnings. Finally, DCOV, defined as common dividends divided
by unrestricted retained earnings (URE), provides a direct measure of nearness to possible 
dividend covenant violations. URE represents an inventory of payable funds and thus the inverse 
of DCOV can be interpreted as the number of years a firm can continue paying dividends at the 
current levels assuming no additional earnings or losses. 
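
The three debt covenant proxies reduce to simple ratios. The following sketch restates the definitions above in code; the function names and the input figures are hypothetical and serve only to illustrate the arithmetic, including the interpretation of the inverse of DCOV as years of dividend capacity.

```python
def leverage(long_term_debt, common_equity):
    """LEV: long-term debt divided by common equity (both at book value)."""
    return long_term_debt / common_equity


def interest_coverage(income_before_extraordinary, interest_expense):
    """INTCOV: (income before extraordinary items + interest) / interest."""
    return (income_before_extraordinary + interest_expense) / interest_expense


def dividend_coverage(common_dividends, unrestricted_retained_earnings):
    """DCOV: common dividends divided by unrestricted retained earnings (URE)."""
    return common_dividends / unrestricted_retained_earnings


# Hypothetical firm, in millions of dollars (illustrative figures only).
lev = leverage(long_term_debt=50.0, common_equity=125.0)
intcov = interest_coverage(income_before_extraordinary=20.0, interest_expense=5.0)
dcov = dividend_coverage(common_dividends=4.0, unrestricted_retained_earnings=16.0)

# The inverse of DCOV is the number of years the firm could keep paying the
# current dividend out of URE, assuming no further earnings or losses.
years_of_dividends = 1.0 / dcov

print(lev, intcov, dcov, years_of_dividends)
```

For this hypothetical firm, a DCOV of 0.25 means URE would cover four more years of dividends at the current level.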

If managers act opportunistically to avoid or loosen debt covenant constraints, then firms 
with higher LEV, lower INTCOV, and higher DCOV are more likely to adopt the income- 
increasing partial allocation method to account for the DISC tax deferral. There are some 
measurement issues concerning the interpretation of the two ratio variables (INTCOV and 
DCOV), however, that deserve mention. Problems arise if the denominators of ratios are negative,
zero, or relatively small in magnitude. Hence, certain coding procedures, similar to those used in prior
research (Bowen et al. 1981; Daley and Vigeland 1983; Shevlin 1987), were applied to these two
ratio variables to keep outliers from dominating the results. The results
were generally insensitive to these coding procedures:
* INTCOV: values less than zero and greater than 100 were reset to 0 and 100, respectively; and
INTCOV of nine firms with zero interest expense but positive income was reset to 100 (the
maximum).

* DCOV: for 15 firms with zero URE, DCOV was set to three, which is greater than the 
maximum DCOV (2.89) of any sample firm with positive URE. For 129 firms for which 
URE was not available, the method of coding DCOV depended on whether it could be 
determined if a dividend restriction existed. For the 77 (out of 129) firms with no dividend 
restriction mentioned, DCOV was coded zero on the assumption that either there was no 
restriction or if one existed, it was immaterial; for 52 (out of 129) firms with a dividend 
restriction mentioned, retained earnings (RE) was substituted for URE if RE was positive, 
or DCOV was coded three if RE was zero or negative. 
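
The coding rules above can be summarized as two small functions. This is a sketch under stated assumptions rather than the procedure actually used in the study: the function and argument names are hypothetical, and cases the text does not describe (zero interest expense without positive income, negative URE, or a mentioned restriction with no retained earnings figure) raise an error instead of being guessed.

```python
def code_intcov(income_before_extraordinary, interest_expense):
    """Interest coverage with the outlier resets described in the text."""
    if interest_expense == 0:
        if income_before_extraordinary > 0:
            return 100.0  # zero interest expense, positive income: the maximum
        # Assumption: this case is not covered in the text, so do not guess.
        raise ValueError("zero interest expense without positive income")
    raw = (income_before_extraordinary + interest_expense) / interest_expense
    return min(max(raw, 0.0), 100.0)  # below 0 resets to 0, above 100 to 100


def code_dcov(dividends, ure=None, retained_earnings=None,
              restriction_mentioned=False):
    """Dividend coverage with the coding for zero or unavailable URE."""
    if ure is not None:
        if ure > 0:
            return dividends / ure
        if ure == 0:
            return 3.0  # above the sample maximum of 2.89 for positive URE
        raise ValueError("negative URE is not covered by the text")
    # URE not available: coding depends on whether a restriction is mentioned.
    if not restriction_mentioned:
        return 0.0  # no restriction, or an immaterial one
    if retained_earnings is None:
        raise ValueError("retained earnings needed when URE is unavailable")
    if retained_earnings > 0:
        return dividends / retained_earnings  # RE substituted for URE
    return 3.0  # RE zero or negative


print(code_intcov(-30.0, 10.0), code_intcov(2000.0, 10.0), code_intcov(5.0, 0.0))
print(code_dcov(4.0, ure=0.0), code_dcov(4.0),
      code_dcov(4.0, retained_earnings=8.0, restriction_mentioned=True))
```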


Public debt variable 


As over 68 percent of the sample firms had zero or immaterial amounts of public debt, a 
dummy variable, PLEV (coded 1 if the firm had public debt), was used to test the renegotiation 
cost hypothesis. PLEV is expected to be positively associated with the use of the income- 
increasing partial allocation method. 


Political cost variables 


As discussed before, firms’ effective tax rates (ETR) were used as a proxy for political costs. 
ETR was defined as total income tax expense divided by pretax income, which corresponds to the 
ETR reported in financial statements. For sensitivity purposes, ETR was also calculated with only 
the current income tax expense in the numerator to correspond to the ETR measure used in the 
political arena. Results are presented based on the first ETR measure and sensitivity tests reported 
with the other measure. 
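
The two ETR measures differ only in the numerator. The sketch below assumes that total income tax expense is the sum of its current and deferred components; the function name and the figures are hypothetical.

```python
def effective_tax_rates(current_tax, deferred_tax, pretax_income):
    """Return the two ETR measures described in the text.

    Assumption: total income tax expense = current + deferred tax expense.
    """
    total_tax = current_tax + deferred_tax
    return {
        "etr_total": total_tax / pretax_income,      # financial statement ETR
        "etr_current": current_tax / pretax_income,  # current-tax-only ETR
    }


# Hypothetical firm, in millions of dollars (illustrative figures only).
rates = effective_tax_rates(current_tax=30.0, deferred_tax=12.0, pretax_income=100.0)
print(rates)
```

Because deferred tax expense enters only the first measure, that measure is the one mechanically affected by the allocation choice, which is why the variables are measured before the DISC-year.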

In addition to ETR, firms’ concentration ratio (CONC) was also included to capture political 
costs. CONC is calculated as sales of the four largest firms in each 4-digit SIC code divided by 
sales of all Compustat firms in the same SIC code. CONC was also defined as an eight-firm 
concentration ratio, with similar results. Following the long-standing argument that firms in more 
concentrated industries face greater political costs, CONC is expected to be negatively associated
with partial allocation.
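
As an illustration of the CONC calculation, the sketch below computes an n-firm concentration ratio from a list of firm sales within one SIC code. The function name and the sales figures are hypothetical; in the study the denominator is the sales of all Compustat firms in the same 4-digit SIC code.

```python
def concentration_ratio(sales_by_firm, top_n=4):
    """Sales of the top_n largest firms divided by sales of all listed firms."""
    ranked = sorted(sales_by_firm, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total industry sales must be positive")
    return sum(ranked[:top_n]) / total


# Hypothetical 4-digit SIC industry with six firms (sales in millions of dollars).
industry_sales = [400.0, 250.0, 150.0, 100.0, 60.0, 40.0]

conc4 = concentration_ratio(industry_sales)           # four-firm ratio
conc8 = concentration_ratio(industry_sales, top_n=8)  # eight-firm ratio

print(conc4, conc8)
```

With fewer than eight firms in the industry, the eight-firm ratio in this example is 1 by construction, which is one reason the two definitions can behave differently in thinly populated SIC codes.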


Efficiency-based variables 


The first efficiency-based variable is WCOP, defined as firms’ working capital from operations
scaled by total assets, averaged over the five-year period following the sample firm's DISC-year.13
WCOP proxies for management's expectations regarding the reversal of DISC benefits in future years.
However, as discussed before, the sign on WCOP cannot be predicted unambiguously.

The second efficiency-based variable is a measure of firms’ investment opportunity sets. The 
problem is that the ios are unobservable and no consensus exists in the literature on the appropriate 
empirical proxy. Hence, two different variables, PPE/MVE and EPRATIO, that have been 
employed in recent studies were used (e.g., Smith and Watts 1992; Gaver and Gaver 1993; 
Skinner 1993). PPE/MVE is defined as gross property, plant and equipment divided by the sum
of the market value of equity and book value of debt, and EPRATIO is the earnings-price ratio. 


13 This definition was chosen because: (1) WCOP is not affected by the reporting choice, and (2) WCOP captures future
information about the firm that is required by the hypothesis. Although excluding the cash flows of the DISC itself from 
WCOP would have been desirable, this was not possible as separate financial statements for DISCs were not available 
(i.e., DISCs were treated on a consolidated basis).
Both PPE/MVE and EPRATIO are based on market-value measures, which appeal
most frequently in the literature, and are increasing (decreasing) with assets in ple 
opportunities). Two other measures, market-to-book assets and market-to-book equit 
used but they yielded results qualitatively similar to those with PPE/MVE. As in p 
(Gaver and Gaver 1993; Skinner 1993), all of these measures are significantly con 
each other (correlations range from .33 to .92), which is consistent with these variable 
the same underlying construct. 


Auditor variable 


Two dummy variables (AA and PW) representing the stated positions of Arthu 
and Price Waterhouse are used to test the auditor hypothesis. AA (coded 1 if extern: 
Arthur Andersen) is expected to be negatively correlated with partial allocation, v 
(coded 1 if external auditor is Price Waterhouse) is expected to be positively correl 


Control variables 


Firm size is included as a control variable in this study because it is known to bx 
with various phenomena. Firm size may also proxy for the size of the tax benefits wh 
may affect firms’ accounting method choice. Support for this notion is provided t 
studies which showed that "large U.S. corporations with DISC subsidiaries were ! 
beneficiaries of the DISC provisions" (Hartzok 1983). Firm size is measured as 
logarithm of net sales (LSAL). For sensitivity purposes, total assets were also used in 
sales. 

Another control variable included is firm performance measured as return on as: 
Another facet of managerial opportunism is the tendency of poorly performing firm: 
switch to income-increasing accounting methods so as to mask such performance . 
evidence consistent with this argument (e.g., Lilien et al. 1988; Pincus and Wasley | 
is defined as income before extraordinary items plus depreciation and interest dividec 
of market value of equity and book value of debt. 

For reasons discussed earlier, control variables for industry membership and DIS‘ 
also included. Two dummy variables, DURB (coded 1 if firm is a durable goods m: 
and NDURB (coded 1 if the firm is a nondurable goods manufacturer) were used | 
membership. Finally, two year dummies, DYR72 (coded 1 if the firm's DISC-year we
DYR73 (coded 1 if firm's DISC-year was 1973), were used to represent the years i 
sample firms first had a DISC operational.
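
For reference, the hypothesis and size variables defined in this section can be collected into a single specification. The expression below is a sketch, assuming a logit link with a linear index, where P_i denotes the probability that firm i chooses partial allocation; the coefficient symbols are notation introduced here, and the remaining controls (investment opportunity set, performance, industry, and year dummies) would enter as additional terms.

```latex
\ln\!\left(\frac{P_i}{1-P_i}\right)
  = \beta_0 + \beta_1\,LEV_i + \beta_2\,INTCOV_i + \beta_3\,DCOV_i
  + \beta_4\,PLEV_i + \beta_5\,ETR_i + \beta_6\,CONC_i
  + \beta_7\,WCOP_i + \beta_8\,AA_i + \beta_9\,PW_i + \beta_{10}\,LSAL_i
```

Under the hypotheses as stated, the predicted signs are positive for LEV, DCOV, PLEV, and PW, negative for INTCOV, ETR, CONC, and AA, and indeterminate for WCOP.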


V. RESULTS 
Descriptive Statistics and Univariate Analysis 


Descriptive statistics for the explanatory variables defined earlier and parameti 
parametric univariate tests of differences between these variables are presented 
Consistent with the contracting cost arguments, partial allocators are observed to h 
cantly higher leverage and lower interest coverage ratios compared to comprehensiv: 
However, the two groups do not differ in their dividend constraints or frequency of 
usage. Consistent also with the tax-based political cost hypothesis, partial allo 
relatively lower effective tax rates, but their concentration ratios are similar to con 
allocators. The evidence on efficiency-based variables is mixed. Partial allocators a 
to have relatively poor working capital from operations, which is consistent with 
wanting to preserve their DISC tax benefits. However, the two groups do not differ in the
proportion of assets in place and in fact partial allocators have significantly lower earnings-price
ratios, which is inconsistent with the argument that firms with fewer growth options are more
likely to use income-increasing accounting choices. Consistent with the auditor hypothesis, a
significantly lower (higher) proportion of partial allocators had Arthur Andersen (Price Waterhouse)
as their external auditor. Finally, while the two groups do not appear to differ in size, partial
allocators have poorer profitability than comprehensive allocators.

For sensitivity purposes, these tests were repeated on alternative definitions of the explanatory
variables as well as different coding procedures, with essentially similar results.¹⁴ To
examine potential interactions, the tests were also repeated on subsamples of firms by the two
industry groups (manufacturing durables and nondurables) and by each of the three DISC-years 
(1972, 1973, and 1974), with results generally similar to the full sample. Taken together, the 
results provide preliminary evidence consistent with both managerial opportunism and efficiency 
arguments as well as the auditor hypothesis developed earlier. 


Multiple Regression Analysis 


To examine the joint and partial effects of the explanatory variables on the DISC accounting 
method choice, and especially to examine the relative importance of competing explanations, 
multiple logistic regression models were estimated. The dependent variable in these regressions 
is the log-odds of the probability that the sample firm chooses the income-increasing partial 
allocation method to account for the DISC tax deferral. Maximum likelihood coefficient 
estimates (with standard errors in parentheses) for the independent and control variables for 
various models are presented in table 3. Statistical significance for the variables is denoted in the 
table based on asymptotic t-statistics.¹⁵ Model 1 simultaneously tests the predictions of the
contracting costs, political costs, and auditor hypotheses, while controlling for the DISC reversal 
likelihood and firm size. Models 2 and 3, respectively, augment model 1 with additional controls 
for firms' investment opportunity sets and operating performance. In addition, model 4 includes 
industry membership and year dummies. Model 5 is similar to model 4 except in the ios variable 
used. Whereas PPE/MVE is used as the ios variable in models 2—4, EPRATIO is used in model 
5. Because negative earnings-price ratios are meaningless, model 5 was estimated only for firms 
with positive EPRATIOs. The specifications used in models 1—5 were chosen to allow an 
examination of the importance of the various hypothesized relationships in the presence of 
different combinations of the control variables. 
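
The estimation step described above can be sketched in a few lines. This is a minimal illustration, not the study's code: it fits a logit model by maximum likelihood using Newton-Raphson iterations and derives asymptotic standard errors from the inverse information matrix. The data are simulated, and the names `fit_logit`, `standard_errors`, and `true_beta` are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score_and_information(X, y, beta):
    """Gradient of the log likelihood and the information matrix at beta."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        for a in range(k):
            grad[a] += (yi - p) * xi[a]
            for c in range(k):
                info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
    return grad, info

def fit_logit(X, y, max_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood estimates of the logit coefficients."""
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        grad, info = score_and_information(X, y, beta)
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def standard_errors(X, y, beta):
    """Asymptotic standard errors: square roots of diag(inverse information)."""
    _, info = score_and_information(X, y, beta)
    k = len(beta)
    return [math.sqrt(solve(info, [1.0 if i == j else 0.0 for i in range(k)])[j])
            for j in range(k)]

# Simulated (hypothetical) data: an intercept plus two regressors.
random.seed(0)
true_beta = [1.0, 0.8, -0.5]
X, y = [], []
for _ in range(300):
    x = [1.0, random.gauss(0, 1), random.gauss(0, 1)]
    X.append(x)
    y.append(1 if random.random() < sigmoid(sum(b * v for b, v in zip(true_beta, x))) else 0)

beta_hat = fit_logit(X, y)
se_hat = standard_errors(X, y, beta_hat)
t_stats = [b / s for b, s in zip(beta_hat, se_hat)]  # asymptotic t-statistics
```

Each coefficient is the estimated change in the log-odds of the outcome per unit change in the regressor, which is how the table 3 estimates are read.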

To evaluate the overall performance of the logit models, several statistics are reported in table
3. Goodness-of-fit tests based on the likelihood ratio statistic indicate that all models are 


14 Specifically, the sensitivity tests included the following. First, INTCOV was tested by: (1) dropping the nine firms with
zero interest expense but positive income (before extraordinary items), whose INTCOV was set to 100; and (2) not
resetting INTCOV<0 and INTCOV>100 to 0 and 100, respectively. Second, DCOV was tested by (1) dropping the 52
firms with dividend restrictions mentioned but URE not available (DCOV computed using RE); and (2) dropping the
77 firms with no dividend restrictions mentioned (DCOV coded 0). Third, the actual magnitude of public debt was used
instead of the public debt dummy. Fourth, because of difficulties in interpreting ETR<0 and ETR>1, the tests were
repeated on the ETR measures constrained to lie in the (0,1) interval. Finally, AA and PW were redefined to include
the other Big 8 auditors as discussed in footnote 7. The results of the univariate tests were essentially similar for all of
the above (Gupta 1990).

15 Stone and Rasp (1991) show that, with disparate response group sizes and skewed predictor variables (which is true of
the data used in this study), the asymptotic t-statistics associated with the coefficient estimates of a logit model are
“conservatively biased,” i.e., the significance levels reported in table 3 are understated.
TABLE 2

Descriptive Statistics and Univariate Tests of Differences in the Explanatory and
Control Variables by the Sample Firms’ DISC Accounting Method Choice ᵃ

              Partial Allocators    Comprehensive Allocators
Variable ᵇ      Mean     Median         Mean     Median        t-stat    Z-stat
LEV            0.419      0.348        0.326      0.244          2.19      1.63
INTCOV         9.687      3.852       17.309      5.175         -2.23     -2.71
DCOV           0.333      0.035        0.271      0.065          0.70     -0.78
PLEV           0.319      0            0.305      0              0.24      0.24
ETR            0.430      0.458        0.463      0.472         -2.35     -1.82
CONC           0.805      0.841        0.775      0.804          1.20      0.76
WCOP           0.101      0.102        0.118      0.119         -3.15     -2.91
PPE/MVE        0.625      0.561        0.692      0.583         -1.10     -1.18
EPRATIO        0.076      0.062        0.098      0.078         -2.05     -2.08
AA             0.189      0            0.305      0             -2.03     -2.18
PW             0.118      0            0.024      0              3.45      2.49
LSAL           4.741      4.671        4.838      4.942         -0.48     -0.63
ROA            0.081      0.080        0.094      0.093         -2.48     -2.34


ᵃ Based on data for the year prior to the sample firm’s DISC-year. Number of observations range from 223 to 238 for
Partial Allocators and from 77 to 82 for Comprehensive Allocators (except for EPRATIO, defined below, which is based
on data for firms with EPRATIO>0).

ᵇ Variable definitions (with Compustat item numbers in parentheses where applicable) are as follows:

LEV     = long-term debt (9) / common equity (60)
INTCOV  = (income before extraordinary items (18) + interest expense (15)) / interest expense
DCOV    = common dividends (21) / unrestricted retained earnings
PLEV    = 1 if public debt > 0, 0 otherwise
ETR     = total income tax expense (16) / pretax income (18+49+16)
CONC    = 4-firm concentration ratio calculated as sales of the 4 largest firms divided by sales of all firms in the
          same 4-digit SIC code
WCOP    = working capital from operations (110) / total assets (6), averaged over 5 years from the sample firm's
          DISC-year
PPE/MVE = gross property, plant & equipment (7) / (market value of equity (24*25) + long-term debt)
EPRATIO = earnings per share (58) / closing price per share (24)
AA      = 1 if auditor is Arthur Andersen, 0 otherwise
PW      = 1 if auditor is Price Waterhouse, 0 otherwise
LSAL    = log of net sales (12)
ROA     = (income before extraordinary items plus depreciation (14) and interest (15)) / (market value of equity
          + long-term debt).
TABLE 3
Coefficient Estimates (with Standard Errors in parentheses) of Logit Regressions ᵃ


Dependent Variable: Natural Log of the Odds that the DISC’s Accounting Method Choice is
Partial Allocation


Variable ᵇ                                          Model
(expected sign)         1            2            3            4            5
Explanatory Variables:
Constant           [illegible]    3.426***     4.226***     2.622        3.119*
                     (1.167)      (1.209)      (1.304)      (1.600)      (1.789)
LEV (+)              0.521        0.529        0.793*       1.043*       1.034*
                     (0.510)      (0.535)      (0.585)      (0.649)      (0.677)
INTCOV (-)          -0.008*      -0.014**     -0.014**     -0.015**     -0.013**
                     (0.006)      (0.007)      (0.007)      (0.008)      (0.008)
DCOV (+)            -0.058       -0.051        0.001       -0.032       -0.062
                     (0.210)      (0.213)      (0.223)      (0.236)      (0.240)
PLEV (+)            -0.171       -0.127       -0.115       -0.005       -0.004
                     (0.394)      (0.402)      (0.407)      (0.440)      (0.446)
ETR (-)             -1.642       -1.610       -1.743       -1.944       -2.300
                     (1.375)      (1.344)      (1.513)      (1.743)      (2.344)
CONC (-)             0.482        0.321        0.274       -0.073       -0.192
                     (0.729)      (0.740)      (0.743)      (0.799)      (0.812)
WCOP (?)            -8.800**     -9.560***    -8.422**     -6.661*      -8.617**
                     (3.630)      (3.741)      (3.757)      (3.762)      (3.983)
PPE/MVE (?)                      -0.838**     -0.004        0.149
                                  (0.355)      (0.486)      (0.516)
EPRATIO (?)                                                             -4.542
                                                                         (3.767)
AA (-)              -0.573**     -0.715**     -0.812***    -0.869***    -0.822**
                     (0.327)      (0.338)      (0.345)      (0.369)      (0.373)
PW (+)               1.395**      1.315**      1.150*       1.437**      1.424**
                     (0.768)      (0.770)      (0.780)      (0.820)      (0.823)
Control Variables:
LSAL                -0.076       -0.039       -0.063       -0.012       -0.00
                     (0.117)      (0.120)      (0.123)      (0.134)      (0.132)
ROA                                          -15.105***   -12.706*      -5.329
                                               (5.594)      (6.812)      (6.742)
NDURB                                                      -0.377       -0.501
                                                            (0.554)      (0.570)
DURB                                                        1.136**      0.977*
                                                            (0.459)      (0.469)
DYR72                                                       1.246**      1.029*
                                                            (0.530)      (0.553)
DYR73                                                       0.098       -0.036
                                                            (0.469)      (0.487)
N                    273          272          272          272          261
LR ᶜ                 27.09        32.90        40.66        65.82        60.98
p(LR)                0.0025       0.0005       0.0001       0.0001       0.0001
LRI ᵈ                .085         .104         .129         .208         .198
% correct ᵉ          72.2      [illegible]     70.2         73.9         72.8


ᵃ Based on data for the year prior to the sample firm's DISC-year. Models 1-4 are based on all firms with no missing data;
and model 5 is based on the subsample of firms with EPRATIO > 0. ***, **, and * denote significance at the .01, .05,
and .10 level based on asymptotic t-statistics. One-tailed tests are used for explanatory variables with directional
predictions, whereas two-tailed tests are used for explanatory variables with no sign predictions and for all control
variables.


ᵇ Variable definitions are as follows (note: PPE/MVE is used as the investment opportunity set proxy in Models 2-4, and
EPRATIO in Model 5):


LEV     = long-term debt / common equity
INTCOV  = (income before extraordinary items + interest expense) / interest expense
DCOV    = common dividends / unrestricted retained earnings
PLEV    = 1 if public debt > 0, 0 otherwise
ETR     = total income tax expense / pretax income
CONC    = 4-firm concentration ratio calculated as sales of the 4 largest firms divided by sales of all
          firms in the same 4-digit SIC code
WCOP    = working capital from operations / total assets, averaged over 5 years from the sample firm's
          DISC-year
PPE/MVE = gross property, plant & equipment / (market value of equity + long-term debt)
EPRATIO = earnings per share / closing price per share
AA      = 1 if auditor is Arthur Andersen, 0 otherwise
PW      = 1 if auditor is Price Waterhouse, 0 otherwise
LSAL    = log of net sales
ROA     = income before extraordinary items before depreciation and interest / (market value of equity
          + long-term debt)
NDURB   = 1 if nondurable goods manufacturer, 0 otherwise
DURB    = 1 if durable goods manufacturer, 0 otherwise
DYR72   = 1 if DISC-year is 1972, 0 otherwise
DYR73   = 1 if DISC-year is 1973, 0 otherwise.


ᶜ The LR (likelihood ratio) statistic tests the null hypothesis that all coefficients (except the intercept) are zero. It is
asymptotically distributed as chi-square and defined as $-2[L(\omega) - L(\Omega)]$, where $L(\omega)$ and $L(\Omega)$, respectively, are the
constrained (all parameters, βs, except the intercept, are equal to zero) and unconstrained maximum values of the log
likelihood function (Kmenta 1986, 550-556).

ᵈ The LRI is a scalar measure similar to R² in the standard regression model and is defined as
$1 - [L(\Omega)/L(\omega)]$, where $L(\Omega)$ and $L(\omega)$ are defined as above (Kmenta 1986, 555-556).


ᵉ Percentage of firms classified correctly is based on a jackknife procedure using a one-step approximation as described
in the text.
significant at less than the .01 level, and that the explanatory power of the models as measured
by the likelihood ratio index ranges from 8.5 to 20.8%.¹⁶ Although the primary purpose of this
study is descriptive, another goodness-of-fit indicator is the ability of the logit models to correctly
classify firms in the appropriate groups. The classification accuracy of the models lies between
70.2 and 73.9%, which is significantly better at the .001 level than the 61.9% predicted under the
proportional chance criterion (Morrison 1969).¹⁷
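
As a rough illustration of the two likelihood-based fit measures, the sketch below computes the LR statistic and the likelihood ratio index from a pair of maximized log likelihoods. The two log-likelihood values are hypothetical: the study does not report them, and they were chosen only so that the output lands near the model 4 figures in table 3 (LR of about 65.8, LRI of about .208). The function names are likewise illustrative.

```python
def lr_statistic(loglik_restricted, loglik_unrestricted):
    """LR = -2 [L(restricted) - L(unrestricted)]; asymptotically chi-square."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)

def likelihood_ratio_index(loglik_restricted, loglik_unrestricted):
    """LRI = 1 - L(unrestricted) / L(restricted), an R-squared analogue for logit."""
    return 1.0 - loglik_unrestricted / loglik_restricted

# Hypothetical log likelihoods (assumed for illustration, not taken from the study).
ll_restricted = -158.2    # intercept-only model
ll_unrestricted = -125.3  # full model

lr = lr_statistic(ll_restricted, ll_unrestricted)
lri = likelihood_ratio_index(ll_restricted, ll_unrestricted)
print(round(lr, 2), round(lri, 3))
```

Because both measures depend only on the two log likelihoods, the LRI equals LR divided by minus twice the restricted log likelihood, which is a quick consistency check on a reported table.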

The logit regression results for the individual explanatory variables are generally consistent 
with the univariate tests presented earlier. Specifically, table 3 shows that the probability of using 
partial allocation is positively related to LEV and negatively related to INTCOV, with both 
coefficients generally significant at the .10 level or less.¹⁸ These results are consistent with the
managerial opportunism explanation that firms closer to potential debt covenant violations are 
more likely to use income-increasing accounting methods to loosen those constraints. However, 
no support is found for the dividend constraint measure (DCOV). The lack of significance for 
DCOV in this and other accounting choice studies is not surprising given that firms rarely violate 
negative covenants, such as restrictions on payments of dividends (Chen and Wei 1993; Sweeney 
1994).¹⁹ In addition, no support is found for the renegotiation cost hypothesis tested with PLEV.

Table 3 also shows that the probability of using partial allocation is negatively related to ETR 
in all models, with the coefficients generally significant at the .12 level (one-tailed). These results 
provide some support for the tax-based political cost hypothesis that greater tax burdens are 
reflective of higher political costs (Zimmerman 1983), which in turn increases the likelihood of 
choosing income-reducing accounting methods. However, the coefficient of CONC is never 
significant and sometimes is even positive, which is inconsistent with the claim that greater 
industry concentration typically leads to higher political costs and adoption of income-reducing 
accounting methods. 

The coefficient of WCOP, the variable used for the DISC distribution likelihood, is always 
negative and highly significant in all models. The negative coefficient suggests that firms with 
weaker cash flow from operations are more likely to choose partial allocation. This result is 
consistent with the argument that such firms want to preserve their DISC status and tax benefits, 


16 The likelihood ratio (LR) statistic tests the null hypothesis that all coefficients (except the intercept) are zero. Stone and 
Rasp (1991) show that with disparate response group sizes and skewed predictor variables, both of which are applicable 
to this study, the LR statistic is "anticonservatively biased." However, the reported LR statistics in table 3 are large 
compared to the critical value; hence, the bias is not considered a problem. The likelihood ratio index (LRI) is a scalar
measure similar to R² in the standard regression model. See also Amemiya (1981, 1502-1507) and Maddala (1983, 37-41)
for other goodness-of-fit measures and see table 3 for definitions of LR and LRI.

17 The classification accuracy of the full model (4) for the two groups is 89.9 (30.1)% for the partial (comprehensive)
allocators. Computations are based on a jackknife approach using a one-step approximation, which reduces bias
resulting from classifying the same data from which the classification criterion is obtained (see SAS 1990, pp. 1091-
1092). The benchmark used to evaluate the model’s classification accuracy is Morrison’s (1969, 158) proportional
chance criterion ($C_{pro}$), which is appropriate when one wishes “to correctly identify members of both groups.” Formally,
$C_{pro} = \alpha^2 + (1-\alpha)^2$, where $\alpha$ is the proportion of observations in Group 1 and $(1-\alpha)$ is the proportion of observations in
Group 2. Significance tests were performed using the Z-statistic, computed as $Z = (p - C_{pro}) / \sqrt{C_{pro}(1 - C_{pro})/N}$, where $p$ is
the observed correct classification rate and $C_{pro}$ is the chance criterion employed as the benchmark. The Z-statistic ranged
from 2.82 to 4.07. An alternative to $C_{pro}$ is the maximum chance criterion [$C_{max} = \max(\alpha, 1-\alpha)$, with $\alpha$ as defined before]
under which all sample observations are naively assigned to the larger of the two groups. Thus, under $C_{max}$, all sample
firms would be classified as partial allocators and achieve a success rate of 74.1%. The classification success achieved
by estimated logit models is not better than $C_{max}$.
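
The chance criteria and significance test in this footnote can be checked numerically. The sketch below is illustrative only and the function names are hypothetical. It plugs in the reported 61.9% benchmark, N = 272, and the lowest and highest classification rates (70.2% and 73.9%), which gives Z values close to the reported range of 2.82 to 4.07.

```python
import math

def proportional_chance(alpha):
    """Morrison's proportional chance criterion: alpha^2 + (1 - alpha)^2."""
    return alpha ** 2 + (1.0 - alpha) ** 2

def maximum_chance(alpha):
    """Maximum chance criterion: assign every firm to the larger group."""
    return max(alpha, 1.0 - alpha)

def z_statistic(p, c, n):
    """Z-test of an observed classification rate p against chance benchmark c."""
    return (p - c) / math.sqrt(c * (1.0 - c) / n)

# Inputs taken from the text: benchmark 61.9%, N = 272 (models 3 and 4).
z_low = z_statistic(0.702, 0.619, 272)
z_high = z_statistic(0.739, 0.619, 272)
print(round(z_low, 2), round(z_high, 2))
```

Small differences from the published figures are expected because the benchmark and sample proportions are rounded in the text.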

18 Although, technically, the coefficients in the logit models describe the impact on the natural logarithm of the odds that 
a DISC firm uses partial allocation, the term probability is used throughout to refer to “the log of the odds.”

19 Specifically, violation of dividend covenants is rare because firms typically have sufficient cushions of unrestricted 
retained earnings that allow them to continue paying dividends for several years. In this study, the median partial 
(comprehensive) allocator could pay out dividends at the current levels for over forty (12) years.
but inconsistent with these firms using the DISC profits to meet their cash flow nee
WCOP remains negative and significant even in models that separately include a 
profitability (ROA), it is unlikely that WCOP is simply reflecting the tendency of les 
firms to use earnings-enhancing accounting procedures to mask their poor perf 
observed in prior studies (e.g., Lilien et al. 1988; Pincus and Wasley 1994). 

With regard to firms’ investment opportunity sets, the other efficiency-based \ 
coefficients of PPE/MVE (in models 2-4) or EPRATIO (in model 5) are generally not 
Moreover, they have the wrong sign in three out of the four models in which they a 
This does not provide evidence consistent with the ios having a direct or indirect effe 
accounting method choice as predicted in Skinner (1993). Together with the results 
the evidence in this study is only partially supportive of the efficiency-based argument 
choices could also reflect their economic expectations and be based on genuine 
considerations. 

Finally, table 3 results indicate strong support for the auditor hypothesis in this 
both auditor dummies significant in the hypothesized direction at less than the .05 
models. This relationship is observed even in models including industry dumr 
provides some assurance that the auditor variable is not simply proxying for an indus 
this study. 

As table 3 also shows, the sign, statistical significance, and magnitude o: 
coefficients are stable across all models regardless of controls for firm size, profitabil 
membership, and DISC-year. However, model 4, which includes all of these contn 
has more than twice the explanatory power and marginally higher classification accui 
other models, suggesting that these variables are important in explaining the DISC 
choice. Specifically, the probability of using partial allocation is increased if the firm 
goods manufacturer, its DISC-year is 1972, or it has low profitability. As mentionet 
reason for durable goods manufacturers to disproportionately favor partial alloc 
obvious; however, some possible reasons are discussed below. The significance 
dummies could also reflect firms’ expectations regarding survival of the DISC pr 
reversal of the tax benefits. Finally, the significance of ROA is consistent wit 
opportunistic behavior aimed at masking poor performance. 


Additional Sensitivity Tests and Correlation Analysis 


Three principal sensitivity tests were performed to evaluate the robustness of 
First, alternate definitions were used for certain explanatory variables that arguat 
defined in fundamentally different ways. Second, the analysis was performed usin 
1971 financial statements on the assumption that all firms established DISCs in t 
available, even though disclosure occurred in a subsequent year’s annual report 
analysis was also repeated on a subsample of firms that were believed to have mz 
benefits. In general, the results are similar to those reported earlier. 

If leverage is defined as total debt divided by net tangible assets, then LEV is s 
the hypothesized direction at the .12 level and WCOP is also significant at the .12 le 
ROA is negative and significant at .05. If ETR is defined to include only current ta: 
the numerator instead of total tax expense, its significance is reduced and the res 
sensitive to outliers— the significance levels are less than .20 if ETR is constrained tc 
0 and 1 but greater than .25 if not constrained. If the numerator of ROA is define
without adding back depreciation and interest, ROA itself is significant at less than 
but LEV is significant at the .12 level. If firm size is defined as log of total assets, there 
in either the size variable itself or the other variables. In all of the above variations 
model remains significant and has similar explanatory power and the sign, significance, and
magnitude of the other coefficients are unaffected. All of these tests are based on model 4 which 
includes all of the explanatory and control variables. 

As discussed before, rather than use data for the year prior to when the DISC was first
mentioned in NAARS, 1971 data could have been used for all firms under the assumption that they
all set up the DISC in the first year they became available. This approach has merit if firms actually
set up DISCs in 1972 but simply disclosed details about them in later years either because ASR
149, which required detailed tax footnote disclosure, was not in effect until December 1973, or
because the DISC was made operational in a later year. Estimates using 1971 data are very similar,
except that the significance of INTCOV and ROA was marginally lower and that of ETR
and WCOP marginally higher.

The results were also estimated on a subsample after deleting firms believed to have an 
immaterial DISC tax benefit. From the footnote disclosures, data were also collected on the dollar 
amount of the DISC tax benefit. However, these disclosures were not uniform and were missing 
for several firms. Assuming that firms not disclosing precise dollar effects had immaterial DISCs, 
the regression model was estimated on the remaining subsample of 214 firms. The explanatory 
power of this model was 24.3% and classification accuracy 76.3%, both higher than that for the 
full sample. In addition, the sign and significance of the explanatory variables were similar with 
one main difference— ETR was not significant at any reasonable level. 

Typically, the explanatory variables in accounting choice studies are endogenous (Watts and 
Zimmerman 1990). As discussed earlier, this is indeed likely for several of the variables in this 
study. These variables and DISC firms’ response to the interperiod tax allocation choice are 
jointly determined by some exogenous factors that are not directly observable. This endogeneity 
also introduces multicollinearity, which could affect the regression results discussed above. To 
examine this issue, pairwise correlations were computed and are presented in table 4.²⁰

While several of these correlations are statistically significant, their absolute values are 
generally low (less than 0.40). In addition to the correlations, tolerance levels based on Belsley 
et al. (1980) were computed and are presented in table 4. Only ROA, PPE/MVE, and NDURB 
have tolerance levels less than .50 and, as table 3 shows, excluding these variables from the 
regressions does not change the inferences for the included variables. Overall, the data do not 
appear to suffer from harmful multicollinearity. 
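
A tolerance diagnostic of the kind mentioned above is commonly computed as one minus the R² from regressing each explanatory variable on all the others. The sketch below assumes that definition and uses simulated data; `x1`, `x2`, `x3`, and the function names are hypothetical and are not the study's variables. Values near 1 indicate little collinearity, and values near 0 indicate a regressor that is almost a linear combination of the others.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def r_squared(y, X):
    """R-squared from an OLS regression of y on the rows of X (intercept included in X)."""
    k = len(X[0])
    xtx = [[sum(r[a] * r[c] for r in X) for c in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def tolerances(columns):
    """Tolerance of each variable: 1 - R^2 from regressing it on the others."""
    n = len(columns[0])
    result = []
    for j, target in enumerate(columns):
        others = [c for i, c in enumerate(columns) if i != j]
        X = [[1.0] + [c[r] for c in others] for r in range(n)]
        result.append(1.0 - r_squared(target, X))
    return result

# Simulated (hypothetical) regressors: x3 is built to be highly collinear with x1.
random.seed(1)
n_obs = 200
x1 = [random.gauss(0, 1) for _ in range(n_obs)]
x2 = [random.gauss(0, 1) for _ in range(n_obs)]
x3 = [a + 0.3 * random.gauss(0, 1) for a in x1]

tol = tolerances([x1, x2, x3])
flagged = [j for j, t in enumerate(tol) if t < 0.50]  # same .50 cutoff as in the text
```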

The correlations in table 4 yield additional insights on certain relationships of interest, 
especially the insignificant variables in the regression models. For example, the high positive 
correlation between PLEV and LSAL is consistent with the claim that accessing capital markets 
and borrowing from the public entails higher costs and greater resources, which larger firms may 
be better able to afford. Similarly, the significant negative correlation between LSAL and DURB, 
together with the earlier finding that a greater proportion of durable goods manufacturers chose 
partial allocation, provides indirect evidence consistent with the firm size hypothesis that larger 
firms tend to adopt income-decreasing accounting procedures. The low correlation between 
LSAL and the industry dummies also suggests that Ball and Foster’s (1982) concern that firm size 
could proxy for industry membership may not be a problem in this study. Finally, the significant 
positive correlation between LSAL and ROA lends some credence to the contention that larger 
firms tend to attract greater political attention partially because they are more successful, although 
political scrutiny is a highly time-specific phenomenon. 


20 Spearman rank correlations are presented because of skewness in the distribution of several variables. Pearson product-
moment correlations were similar in sign, significance, and magnitude for most variables, except the correlations of 
DCOV and ETR with the other variables were generally lower. 
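
To illustrate why rank correlations are less sensitive to skewness, the sketch below computes a Spearman correlation as the Pearson correlation of average ranks and compares it with the ordinary Pearson coefficient on a small made-up sample containing one extreme value. The data and function names are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up sample: the relation is monotonic, but x has one extreme value.
x = [1.0, 2.0, 3.0, 4.0, 100.0]
y = [2.0, 4.0, 6.0, 8.0, 9.0]

rho_spearman = spearman(x, y)
rho_pearson = pearson(x, y)
```

Here the rank correlation is exactly 1 because the ordering of the two series agrees, while the Pearson coefficient is pulled below 1 by the outlier.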
Although no significance was observed for the ios variables in the regressions, the significant
positive correlation between PPE/MVE and both LEV and PLEV is consistent with Myers’ 
(1977) prediction and recent empirical evidence (e.g., Smith and Watts 1992; Gaver and Gaver 
1993; Skinner 1993) that firms with more assets in place will have greater financial leverage. 
Thus, while the ios may have an important indirect effect on firms’ accounting method choice 
through financial policies, their direct effect on accounting method choice may be weak or may 
require better proxies. 


VI. SUMMARY AND CONCLUSIONS


The purpose of this study was to examine firms’ motivations for choosing between com- 
prehensive and partial methods of interperiod tax allocation. The empirical analysis simulta- 
neously included variables motivated by the managerial opportunism perspective based on 
contracting and political cost arguments, and by the efficient contracting perspective on 
accounting method choice. The findings allow an examination of the relative importance of these 
explanations in describing accounting practice. 

Results of the empirical analysis are based on a sample of 320 firms with a Domestic 
International Sales Corporation (DISC) operational in 1972-74. The principal results and the 
inferences that can be drawn from them are as follows. First, consistent with the contracting cost 
arguments, firms with higher leverage and lower interest coverage ratios were found to be more 
likely to adopt the income-increasing partial allocation method. Some support was also found for 
the tax-based political cost hypothesis that firms bearing higher tax burdens (manifested through 
higher effective tax rates) are less likely to adopt partial allocation. However, firms with low 
profitability were more likely to use partial allocation. These results are consistent with managers
acting opportunistically to avoid potential debt covenant violations or political scrutiny, or to mask
poor performance.

Second, with regard to the efficiency-based variables, firms' accounting method choice 
appears to be related to their economic expectations. In particular, firms with weaker future cash 
flows were found to be more likely to use partial allocation, which is consistent with these firms 
wanting to preserve their DISC status and avoid losing the tax benefits. However, firms’ choices
are not observed to be associated with variables surrogating for their investment opportunity sets. 
These results provide only partial support for the efficient contracting perspective of accounting 
procedure choice. 

Finally, firms' choice between partial and comprehensive allocation was found to be strongly 
related to their external auditors’ stated position on tax allocation. The significance of the auditor 
variable after controlling for both the managerial opportunism and efficient contracting perspec- 
tives of accounting choice is noteworthy as it suggests the importance of considering the role of 
other stakeholders besides bondholders, stockholders and managers in firms' reporting decisions. 
Examining auditor effects with more recent accounting issues can be a potentially fruitful 
extension of this study, although observing such diametrically opposed positions as in the tax 
allocation issue studied here could be more difficult today, perhaps due to the increased 
litigiousness and competition in the current environment faced by auditors. 

Like other positive accounting studies, this study's results are subject to certain caveats and 
suggest directions for future research. First, while this study focuses on a single accounting 
method choice, it is likely that managers evaluate their entire portfolio of choices simultaneously. 
If that is indeed the case, the power of the tests in this study is reduced, although prior studies using 
the portfolio approach still achieve relatively low explanatory power (Zmijewski and Hagerman 
1981; Press and Weintrop 1990). Second, the explanatory variables in this and other accounting 
method choice studies are generally endogenous. This problem is particularly applicable to the
effective tax rate and auditor variables in this study, making it difficult to draw inferences 
regarding cause and effect. Although circumstantial evidence suggests potential directions in this 
study and measuring these variables prior to the availability of the accounting choice mitigates 
some of the endogeneity, further corroboration is required in other contexts. Finally, the lack of 
support for an association between firms’ investment opportunity sets and their accounting 
procedure choice provided by this study, together with similar results in Skinner (1993) and 
Pincus and Wasley (1994), suggests either that the relation is weak or that it requires better empirical
proxies than those used in the literature thus far.
ABSTRACT: We study whether capital expenditures provide value relevant information
which is incremental to that of current earnings. Models in accounting or
finance generally predict that investments such as capital expenditures yield
information about a firm's future earnings that is not captured by current earnings,
as managers respond to private information about future demand and costs through
their investment decisions. Empirical research, however, has not provided consistent,
strong evidence of this effect. After controlling for concurrent earnings information
and size-related predisclosure information differences, we find that capital
expenditures changes are strongly and positively associated with excess returns.


Key Words: Earnings response coefficient, Capital expenditures response coefficient,
Predisclosure information environment.


Data Availability: Data used in this study are available in the annual industrial 
Compustat files and the Center for Research in Security Prices 
I. INTRODUCTION


Research on the earnings response coefficient (ERC) has shown that factors such as size,
risk, and growth are important to the valuation of the firm’s earnings (e.g., Atiase 1985,
1987; Collins and Kothari 1989; Easton and Zmijewski 1989). Since earnings are
reflections of the firm’s investments, it seems natural to expect that the valuation of the underlying 
investments is also sensitive to these and other relevant factors. Few studies, however, have tested 
whether this is true. 
Our work examines the value relevance of changes in firms' capital expenditures. A
traditional view is that capital expenditures provide information to the market about a firm's 
future earnings that is not captured by current earnings, as managers respond to private 
information about future demand and costs through their investment decisions. Managers give a 
positive (negative) signal about the firm's available positive net present value projects when they 
unexpectedly increase (decrease) their capital expenditures (e.g., Beaver et al. 1980; McConnell 
and Muscarella 1985; Trueman 1986). An alternative view is that investments do not provide 
incremental information beyond current earnings about future earnings growth opportunities 
(e.g., Collins and Kothari 1989; Livnat and Zarowin 1990; Morck et al. 1990).¹

The valuation of capital expenditures is thought to be related to individual firm factors.²
Failure to control for firm differences can result in seemingly weak capital expenditures response
coefficients as previous work has shown (e.g., McConnell and Muscarella 1986, footnote 18; 
Livnat and Zarowin 1990). The firm's sector or industry membership has been used to capture 
certain individual firm differences in capital expenditures (e.g., Healy et al. 1992; Lev and
Thiagarajan 1993). This approach assumes that the relevant cross-sectional differences lie in 
factors that affect only the valuation of a particular subgroup of securities. It has been shown, for 
example, that firms which have a higher (lower) change in capital expenditures than their industry 
provide a positive (negative) valuation signal (e.g., Lev and Thiagarajan 1993). 

To highlight the value relevance of individual firm factors, we incorporate variables which 
mediate the effect of the firm's risk, growth, and earnings levels on the capital expenditures 
response coefficient (CRC). Initially, when we do not control for size-related differences, the 
CRC is marginally significant only in the presence of the mediating variables. This suggests that 
unexpected capital expenditures provide a value relevant signal for firms containing certain 
characteristics. However, when we control for size-related differences in predisclosure
information (i.e., by adjusting the lead-lag structure of returns according to the size of the firm), the capital
expenditures response coefficient becomes much larger and statistically significant in all of our
models.³ In addition, the mediating variables become statistically significant with the expected
signs. 


II. HYPOTHESES DEVELOPMENT 


Our primary focus is on the direction and magnitude of firm value changes when there are 
unexpected capital expenditures. We test whether unexpected capital expenditures provide value 
relevant information which is incremental to that of current earnings. If the market anticipates the 
manager's investment decisions from prior or existing information, then it seems reasonable to 
expect that realized changes in capital expenditures would not be value relevant. All the relevant 
information pertaining to investments would already be impounded in prices. Alternatively, if 
managers obtain private information about future demand and costs which motivates their 
investment decisions, then changes in capital expenditures are likely to be value relevant. 
Increases in capital expenditures would signal good news about available positive net present 
value projects while capital expenditures decreases would signal the opposite. 


1 Under the so-called active informant hypothesis, for example, investment decisions simply reflect existing and known
information in the market (e.g., Morck et al. 1990). 

2 Growth, for example, has been theorized to be an important factor (e.g., John and Mishra 1990).

3 Consistent with Livnat and Zarowin (1990), who also use an association study design along with a less restrictive sample 
than ours, we find that the capital expenditures response coefficient is insignificant when individual firm differences 
in the valuation of capital expenditures are not controlled for. 
Under the first hypothesis below, changes in capital expenditures provide incremental
information content beyond current earnings which is positively linked to future earnings. The 
null hypothesis (not shown below) is that capital expenditures do not reflect private information 
about future demand and costs, as managers respond predictably to information already 
incorporated in security prices. 


H1: The coefficient on unexpected capital expenditures is positive and significant in the
returns equation. 


It has been argued and shown that individual firm factors such as risk and growth mediate the 
market's response to current earnings. Since earnings are reflections of the firm's investments, 
it seems natural to consider whether the valuation of the underlying investments in capital 
expenditures is also sensitive in the same direction to the firm's risk, growth and other relevant 
factors. Higher (lower) systematic risk, for example, is expected to result in smaller (larger) net 
present values of dividends from a given unexpected increase in current earnings. We test whether 
risk has a similar effect on the net present values of future earnings and dividends associated with
unexpected capital expenditures. 


H2: The coefficient on the interaction variable between unexpected capital expenditures and
the risk factor is negative in the returns equation. 


The ERC is hypothesized to be a positive function of the firm's growth opportunities, i.e., 
future earnings and dividend streams stemming from current earnings surprises are thought to be 
larger in the presence of better growth opportunities, so we also consider whether firms with better 
growth opportunities receive more benefits from a given increment of capital expenditures. This 
leads to the following hypothesis: 


H3: The coefficient on the interaction variable between unexpected capital expenditures and
the growth factor is positive in the returns equation. 


Finally, we test whether the firm’s expected rate of return on existing assets mediates the 
market’s expectations on future investment returns.⁴ Firms, for example, with higher (lower)
expected earnings on a given level of assets are anticipated to have larger (smaller) future returns 
on their capital expenditures. This leads to the following hypothesis: 


H4: The coefficient on the interaction variable between unexpected capital expenditures and
expected earnings on a given level of assets is positive in the returns equation. 


III. RESEARCH DESIGN
Sample 


Our study focuses primarily on manufacturing firms (i.e., firms in the 1000 to 3999 SIC range 
of industries) because they are likely to invest more heavily in property, plant and equipment.⁵
The sources of our variables are Compustat and CRSP. (We provide the details in the appendix 


4 Current rates of growth and return have different valuation implications in the Gordon growth model (e.g., Copeland
and Weston 1988), as rates of growth and return are considered separate (though related) factors in the valuation of the 
firm. 


5 That is, capital expenditures represent a much higher proportion of future value than in other industries.
of the specific Compustat and CRSP items used to compute each variable.) The firms in our
sample must satisfy the following conditions: 


1. availability of CRSP data from 1971-1990 and accounting information in Compustat for each 
year during the period 1976-1989;⁶


2. December 31 fiscal year end;⁷


3. the absolute value of our dependent variable (CAR) is less than 10 (i.e., 1000 percent) and 
the absolute value of the estimated beta is less than five; 


4. regression diagnostic tests, primarily Cook's D-statistic, reveal the absence of any outliers 
(i.e., having Cook's D values of one or greater). 


Table 1 provides information on the industry composition of our sample of 153 firms used in 
regression analysis from 1976-1989. There are a total of 2142 firm years (i.e., 153 firms used over 
14 years) drawn from 22 different two-digit industries. Table 2 provides descriptive statistics for 
the sample over the test period. The unexpected portion of a firm's earnings and capital 
expenditures per share is estimated as the realized value in year t minus its realized value in year 
t - 1, scaled, as previous work has done, by the firm's market price at the beginning of year t (e.g.,
Collins and Kothari 1989). As seen in table 2, unexpected capital expenditures (UCAP) has a 
value that is very close to zero, which is consistent with a random walk model. 
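
The random walk expectation described above can be sketched in a few lines. This is only an illustration, not the authors' code: the function name and the per-share figures are hypothetical, and the scaling price is taken to be the share price at the beginning of year t.

```python
def unexpected_per_share(current, previous, begin_price):
    """Random walk surprise: change from year t-1 to year t, scaled by
    the beginning-of-year share price (hypothetical helper)."""
    if begin_price <= 0:
        raise ValueError("beginning-of-year price must be positive")
    return (current - previous) / begin_price

# Hypothetical firm: EPS rises from 2.20 to 2.50, capital expenditures
# per share rise from 4.00 to 4.10, and the share price was 30.00.
ueps = unexpected_per_share(2.50, 2.20, 30.0)
ucap = unexpected_per_share(4.10, 4.00, 30.0)
```

Under a random walk, the expected value of each variable in year t is simply its realized value in year t - 1, so the surprise is the first difference.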


Proxies for Risk, Growth and Expected Earnings 


We develop factors for risk and growth following earlier approaches (e.g., Cho and Jung 
1991). For example, we use the firm's beta (calculated from 60 months of returns data prior to the
beginning of the year to estimate the market model) as a proxy for the firm's risk; and the firm's
market to book value of equity ratio as a proxy for the firm's growth.⁸ Each year firms are assigned
a value of one for each factor whose beginning of the year values exceed the median values of the 
sample; otherwise, firms are assigned a value of 0. In addition to risk and growth, we include a 
factor for expected earnings levels, where each year firms are assigned a value of one if their 
previous year's EPS (scaled by the firm's beginning of the year book value of equity per share) 
is greater than that of the midpoint sample observation in that year; otherwise, the firm is assigned 
a value of 0. The primary motivation for using dummy variables instead of continuous variables 
is that the continuous variables are measured with error (Collins and Kothari 1989), e.g., firms 
have temporarily higher or lower earnings than normal. 
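
A minimal sketch of the median-split dummies for a single year's cross-section follows. The function name and the sample betas are hypothetical; the same split would be applied separately each year to beta, the market to book ratio, and the prior year's scaled EPS.

```python
from statistics import median

def median_split_dummy(values):
    """Return 1 for observations strictly above the cross-sectional
    median of `values`, and 0 otherwise."""
    cutoff = median(values)
    return [1 if v > cutoff else 0 for v in values]

# Hypothetical beginning-of-year betas for five firms in one year.
betas = [0.8, 1.3, 1.1, 0.9, 1.6]
rsk = median_split_dummy(betas)
```

Using a 0/1 indicator rather than the continuous value means that a modest measurement error in beta or in the market to book ratio changes the regressor only for firms close to the median.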


Controlling for Size-Related Differences in the Lead-Lag Structure of Returns 


Previous research has shown that the information environment surrounding the firm is 
sensitive to the firm's size, i.e., larger firms have a greater following. Predisclosures on the firm's 
current earnings, for example, have been found to be significant starting in an earlier window for 


6 Our pooled regressions require a common set of firms over the entire period of study. We restrict our data to firms which
have 20 years of available CRSP data (i.e., to develop market model parameters and excess returns information from 
1971 through the first three months of 1990) and Compustat data from 1976 through 1989 (i.e., the last available year 
of our Compustat data). There are 153 total firms selected. We tested the sensitivity of the results to the sample size by 
running annual regressions. The results were not affected as suggested by the average coefficients and their t-statistics. 

7 We use a December year-end criterion, similar to many previous studies (e.g., Livnat and Zarowin 1990), to facilitate 
data analysis, to enhance comparisons with previous studies that have imposed this restriction, and to ensure comparable 
return accumulation periods for all firms. 

8 Previous work has used beta as a proxy for the firm's risk under the assumption that the firm's cost of capital is an
increasing function of its CAPM beta risk. 
TABLE 1
Industry Distribution of the 153 Manufacturing Firms Studied over the
Period 1976-1989ᵃ

                                           Number of
Industry Name                     SIC      Firm Years
Metal Mining                       10           56
Coal Mining                        12           14
Oil and Gas                        13           42
Non-metal Minerals                 14           14
Heavy Construction                 16           28
Food Products                      20           84
Tobacco, Cigarettes                21           28
Textile Mill Products              22           56
Apparel                            23           14
Furniture                          25           14
Paper Products                     26          168
Publishing                         27           84
Chemical Products                  28          364
Petroleum Refining                 29          154
Tires and Plastics                 30           98
Glass, Cement, Gypsum              32           42
Steel and Aluminum                 33          168
Metal Structures                   34           70
Machinery and Computers            35          224
Electrical Equipment               36          112
Transportation Vehicles            37          196
Miscellaneous Instruments          38          112

Total:                                        2142



a Our industry sample is taken from manufacturing companies (i.e., those with an SIC code of 1000-3999) which have both
available CRSP data from 1971 to 1990 and Compustat data from 1976 to 1989. There are a total of 153 firms used for
each of the 14 years of our analysis (1976-1989), totalling 2142 firm years.


larger firms (e.g., Collins and Kothari 1989). The market seeks information on both near- and 
long-term earnings concurrently, so we also consider whether predisclosure information on 
capital expenditures is related to the firm's size. Failure to control for size-related differences in 
predisclosure information can understate the impact and value relevance of capital expenditures.⁹

One way to address this problem is to adjust the interval over which CAR is determined. The 
size-related information predisclosure differences can then be captured by the use of the 


9 Firms often know their capital expenditure plans in advance of the year that they buy their assets. The SEC, for example,
requires disclosures by firms of their material capital expenditures “commitments” in the Management's Discussion and
Analysis of Financial Condition and Results of Operations (e.g., SEC Financial Reporting Release No. 36). 
appropriate lead-lag structures of returns. To determine the appropriate CAR interval for different
sized firms, we rely on previous work on the predisclosures of current earnings. The CAR for large
and medium sized firms is measured over the 15 month period beginning in August of year t-1;
and the CAR for small firms is based on the 15 month period beginning in November of year
t-1.¹⁰ The view is that the timing of the information signals surrounding the firm is not
independent.¹¹ The determination of whether firms are small, medium or large is based on the beginning
of the year market value of equity from which the sample is divided into three equal sized groups.
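
The size-dependent accumulation window can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' procedure: the function name is hypothetical, the smallest third is taken to be the first n // 3 firms when ranked by market value, and ties are broken by input order.

```python
def car_window_starts(market_values):
    """Return, for each firm, the month of year t-1 in which its 15-month
    CAR window begins: 11 (November) for the smallest third of firms by
    beginning-of-year market value, 8 (August) for medium and large firms."""
    n = len(market_values)
    order = sorted(range(n), key=lambda i: market_values[i])
    small_count = n // 3  # assumption: smallest tercile rounded down
    starts = [8] * n
    for rank, i in enumerate(order):
        if rank < small_count:
            starts[i] = 11
    return starts

# Hypothetical beginning-of-year equity market values for six firms.
starts = car_window_starts([50, 900, 300, 120, 2000, 700])
```

A window that opens in August of year t-1 runs through October of year t, while one that opens in November of year t-1 runs through January of year t+1.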


Empirical Design 

We test the following cross-sectional models: 

Model 1: CAR_{it} = f(UEPS_{it}, UCAP_{it})

Model 2: CAR_{it} = f(UEPS_{it}, UCAP_{it}, UEPS*GTH_{it}, UEPS*RSK_{it})

Model 3: CAR_{it} = f(UEPS_{it}, UCAP_{it}, UCAP*GTH_{it}, UCAP*RSK_{it}, UCAP*EEPS_{it})

Model 4: CAR_{it} = f(UEPS_{it}, UCAP_{it}, UEPS*GTH_{it}, UEPS*RSK_{it}, UCAP*GTH_{it}, UCAP*RSK_{it}, UCAP*EEPS_{it})

where: 

CAR: Cumulative market adjusted returns over a 12-month period starting
April 1 of each year; we also cumulate returns over a 15-month period, the
starting point for which depends on the size of the firm.

UEPS: Unexpected earnings per share scaled by the beginning of the year market
price per share. Expectations of earnings are based on a random walk model.

UCAP: Unexpected capital expenditures per share scaled by the beginning of the
year market price per share. Expectations of capital expenditures are based
on a random walk model.

UEPS*GTH: UEPS interacted with the proxy for growth.

UEPS*RSK: UEPS interacted with the proxy for risk.

UCAP*GTH: UCAP interacted with the proxy for growth.

UCAP*RSK: UCAP interacted with the proxy for risk.

UCAP*EEPS: UCAP interacted with the proxy for expected EPS.


Our basic model, model 1, tests whether unexpected capital expenditures provide value 
relevant information about future earnings growth opportunities in the presence of earnings 
surprises. In models 2-4, we consider the sensitivity of the results in model 1 to the inclusion of
possible omitted variables. We add variables in model 2, for example, to mediate the effect of 
growth and risk on the valuation of earnings. Model 3 contains variables which mediate the effect 
of cross-sectional differences in the valuation of capital expenditures. Finally, model 4 includes 
all of the variables. 
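
Because the mediating variables are 0/1 dummies, the interaction terms in models 3 and 4 shift the capital expenditures response coefficient by firm type. The sketch below illustrates this for model 4. The function names are hypothetical, the coefficient values are made up for illustration and are not the paper's estimates, and the regressor ordering (intercept, UEPS, UCAP, then the five interactions) is an assumption.

```python
def model4_regressors(ueps, ucap, gth, rsk, eeps):
    """Regressor row for model 4: intercept, UEPS, UCAP, and the
    interactions of UEPS and UCAP with the 0/1 dummies."""
    return [1.0, ueps, ucap,
            ueps * gth, ueps * rsk,
            ucap * gth, ucap * rsk, ucap * eeps]

def implied_crc(b_ucap, b_gth, b_rsk, b_eeps, gth, rsk, eeps):
    """Response of CAR to one unit of UCAP for a firm with the given
    growth, risk, and expected earnings dummies."""
    return b_ucap + b_gth * gth + b_rsk * rsk + b_eeps * eeps

row = model4_regressors(ueps=0.02, ucap=0.01, gth=1, rsk=0, eeps=1)

# Hypothetical coefficients carrying the signs predicted by H1 to H4.
crc_high = implied_crc(0.40, 0.30, -0.20, 0.40, gth=1, rsk=0, eeps=1)
crc_low = implied_crc(0.40, 0.30, -0.20, 0.40, gth=0, rsk=1, eeps=0)
```

With these illustrative values, a high growth, low risk, high expected earnings firm has a much larger response to a given capital expenditures surprise than a low growth, high risk, low expected earnings firm.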

We estimate all models using both a pooled regression and a separately run regression 
approach each year during the period from 1976-1989. To pool cross-sectional and time-series 
data, we use ordinary least squares and assume that the coefficients are constant each year, except 


10 Our CAR windows are the ones found by Collins and Kothari (1989), after experimenting with different time periods,
to maximize the adjusted R² of their empirical models.
11 The capital expenditures results will tend to be understated to the extent that this assumption is not true.
TABLE 2
Descriptive Statistics of the 153 Manufacturing Firms Studied over
the Period 1976-1989

Variableᵃ      Mean      Median      Standard Deviation
CAR            0.018     0.023       0.259
UEPS           0.011     0.011       0.105
UCAP           0.008     0.007       0.080
MVBV           1.570     1.300       0.950
BETA           1.100     1.080       0.330
ROE            0.120     0.140       0.190


a The variable definitions are:


CAR: Cumulative market adjusted returns over a 12-month period starting April 1 of each year. 

UEPS: Unexpected earnings per share scaled by the beginning of the year market price per share. Expectations of 
earnings per share are estimated each year from a random walk model. 

UCAP: Unexpected capital expenditures per share scaled by the beginning of the year market price per share. 
Expectations of capital expenditures per share are estimated each year from a random walk model. 

MVBV: To mediate for the valuation effect of the firm’s growth, each year firms are assigned to the high growth 
portfolio (1) if their market value to book value ratio at the beginning of the year (MVBV) is greater than that 
of the midpoint sample observation in that year; otherwise, the firm is assigned to the low growth portfolio (0). 

BETA: To mediate for the valuation effect of the firm's risk, each year firms are assigned to the high risk portfolio (1) 
if their BETA at the beginning of the year is greater than that of the midpoint sample observation in that year; 
otherwise, the firm is assigned to the low risk portfolio (0). 

ROE: To mediate for the valuation effect of the firm's expected levels of earnings, each year firms are assigned to 
the high expected earnings portfolio (1) if their previous year’s return on equity (ROE), which is the previous 
year's earnings per share scaled by the beginning of the current year's book value of equity per share, is greater
than that of the midpoint sample observation; otherwise, the firm is assigned to the low expected earnings 
portfolio (0). 


for intercept shifts (e.g., Collins and Kothari 1989) estimated by using dummy variables for every 
year but the last year. We also pool our data using the Fuller-Battese error components model
discussed in SAS Supplemental Users Guide (also, see Kmenta 1986, 626, footnote 11). The 
results, however, are not sensitive to the pooling method used. 

Because pooling is inappropriate if there are shifts in the cross-sectional parameters over time
(e.g., from changes in the tax laws, new technology, changing international competition) or if the
error terms are autocorrelated, we also run separate annual regressions. We use a t-test to evaluate
whether the average coefficient over the 14 years is different from zero (e.g., see Fama and
MacBeth 1973):¹²


12 The t-test assumes an underlying normal distribution. If the underlying distributions of our coefficients are "fat-tailed"
as they were in the study of Fama and MacBeth (1973), then the probability of a Type II error from the t-test increases 
(e.g., see Hildebrand 1986, 375). Our economic findings could be understated if we falsely accept the null hypotheses 
that our coefficients are not different from zero. We have no reason, a priori, to believe that our coefficients are "fat- 
tailed." 
TABLE 3
Correlation Matrix for the Independent Variables Used in the Regressions for the 153
Manufacturing Firms Studied over the Period 1976-1989

             UCAP          UEPS*GTH      UEPS*RSK      UCAP*GTH      UCAP*RSK      UCAP*EEPS
UEPS         .11[?]        [illegible]   .68***        .05[?]        .08***        .04**
UCAP                       .06***        .09***        .36***        [illegible]   [illegible]
UEPS*GTH                                 .21[?]        .18**         .04           .04
UEPS*RSK                                               .04           .12**         .05[?]
UCAP*GTH                                                             .29**         [illegible]
UCAP*RSK                                                                           [illegible]

UEPS: Unexpected earnings per share scaled by the beginning of the year market price. Expectations of earnings 
per share are estimated each year from a random walk model. 

UCAP: Unexpected capital expenditures per share scaled by the beginning of the year market price. Annual 
expectations are estimated from a random walk model. 

UEPS*GTH: Unexpected earnings per share interacted with growth divided by the beginning of the year market price. 
Each year firms are assigned to the high growth portfolio (1) if their market value to book value ratio at 
the beginning of the year is greater than that of the midpoint sample observation in that year; otherwise, 
the firm is assigned to the low growth portfolio (0). 

UEPS*RSK: Unexpected earnings per share interacted with risk scaled by beginning of the year market price. Each 
year firms are assigned to the high risk portfolio (1) if their beta at the beginning of the year is greater than
that of the midpoint sample observation in that year; otherwise, the firm is assigned to the low risk 
portfolio (0). 

UCAP*GTH: Unexpected capital expenditures per share interacted with growth scaled by the beginning of the year 
market price. Each year firms are assigned to the high growth portfolio (1) if their market value to book 
value ratio at the beginning of the year is greater than that of the midpoint sample observation in that year;
otherwise, the firm is assigned to the low growth portfolio (0). 

UCAP*RSK: Unexpected capital expenditures per share interacted with risk scaled by the beginning of the year market 
price. Each year firms are assigned to the high risk portfolio (1) if their beta at the beginning of the year 
is greater than that of the midpoint sample observation in that year; otherwise, the firm is assigned to the 
low risk portfolio (0). 

UCAP*EEPS: Unexpected capital expenditures per share interacted with expected earnings per share scaled by the 
beginning of the year market price. Each year firms are assigned to the high expected earnings portfolio 
(1) if their previous year's earnings per share (scaled by the beginning of the current year's book value 
of equity per share) is greater than that of the midpoint sample observation; otherwise, the firm is assigned 
to the low expected earnings portfolio (0). 

*** statistically significant at least at the .01 level

** statistically significant at least at the .05 level 

t(\bar{a}_i) = \frac{\bar{a}_i}{\sigma(a_i) / \sqrt{N}}

where: a_i = the regression coefficient for variable i
\bar{a}_i = the average regression coefficient a_i over the 14 years
\sigma(a_i) = the standard deviation of coefficient a_i over the 14 years
N = 14 (the years in the empirical analysis of the regression coefficient).
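
The t-statistic for an average annual coefficient, in the spirit of Fama and MacBeth (1973), can be computed as in the sketch below. The function name and the sample coefficients are hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not say which form was used.

```python
from math import sqrt
from statistics import mean, stdev

def fama_macbeth_t(annual_coefs):
    """t-statistic for the mean of a series of annual regression
    coefficients: mean / (standard deviation / sqrt(number of years))."""
    n = len(annual_coefs)
    if n < 2:
        raise ValueError("need at least two annual coefficients")
    return mean(annual_coefs) / (stdev(annual_coefs) / sqrt(n))

# Hypothetical coefficients on one variable from four annual regressions.
t_stat = fama_macbeth_t([0.2, 0.4, 0.6, 0.8])
```

In the paper's setting the list would hold the 14 annual estimates of a given coefficient, one per year from 1976 to 1989.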
IV. RESULTS


Tables 4 and 5 present our empirical findings. Table 4 is based on pooled ordinary least 
squares regression which provides standard measures for comparison (i.e., adjusted R²s and F-statistics),
while table 5 contains the summaries of individual year regression results. In panel A,
we do not control for predisclosure information differences but do so in panel B. The results for 
tables 4 and 5 are similar with any significant differences noted below. 


No Control for Information Predisclosure Differences 


In panel A, the model 1 coefficient on unexpected earnings per share (UEPS) is, as expected, 
positive and statistically significant, with the coefficient on unexpected capital expenditures 
(UCAP) being close to zero. The coefficient on UCAP is negative in seven out of the 14 individual 
year regressions (not shown). In model 2, we add variables to control for cross-sectional
valuation differences in earnings, but the coefficient of unexpected capital expenditures remains 
unchanged. The coefficient on the interaction term between unexpected earnings and growth 
(UEPS*GTH) has the predicted positive sign and is statistically significant. The coefficient on 
the interaction term between unexpected earnings and risk (UEPS*RSK) has the predicted 
negative sign and is statistically significant in table 4. The insignificant coefficient of UCAP in 
models 1 and 2 could be due to multicollinearity problems that are caused by the positive 
association between current earnings and/or capital expenditures with signals about future 
earnings. We do not find that to be the case, however, as evidenced by low variance inflation 
factors and by small correlation coefficients shown in table 3.¹³ Our insignificant results in models
1 and 2 are consistent with earlier work.¹⁴

In model 3, we add variables which control for the cross-sectional valuation differences in 
capital expenditures. The coefficient on UCAP has the predicted positive sign and is now
marginally significant in table 4. The UCAP*GTH coefficient is positive in model 3 as expected 
but is not statistically significant in either table. The coefficient on the interaction variable 
between unexpected capital expenditures and risk (UCAP*RSK) has the expected negative sign 
and is statistically significant in table 4. The coefficient on the interaction variable between 
unexpected capital expenditures and levels of earnings (UCAP*EEPS) is close to zero in both 
tables. Model 4 includes all the variables, as we add UEPS*GTH and UEPS*RSK to model 3. The 
coefficient on UCAP declines slightly in table 4 and increases in table 5 from .05 to .11 (but is 
still not statistically significant). The coefficient on UCAP*GTH becomes negative (although 
statistically insignificant) in both tables. The coefficient on UCAP*RSK is statistically signifi- 
cant in table 4. Thus, firms with riskier investments have a lower capital expenditures response 
coefficient. The other coefficients generally remain unchanged. 


Control for Information Predisclosure Differences 


We test whether predisclosure information on capital expenditures, like that of earnings, is 
sensitive to the firm's size. In both tables 4 and 5, the models in panel B control for predisclosure 


13 In table 3, the correlation between unexpected earnings (UEPS) and unexpected capital expenditures (UCAP) is only
about .11 which, though statistically significant, appears not to be economically significant. 

14 Our results are consistent with those shown by Livnat and Zarowin (1990). In their excess returns equation 1, table 2,
where they disaggregate their cash flows, the coefficient on investments in property, plant, and equipment is small,
positive, and statistically insignificant. The authors do not find multicollinearity problems to be the
explanation for this low coefficient, either. Similarly, in their equation 3, the authors do not find a statistically significant
positive coefficient on total investments (which includes investments in property, plant and equipment). 
TABLE 4
Pooled Regressions of Excess Returns on Unexpected Earnings, Unexpected Capital
Expenditures, and Variables which Mediate for Cross-Sectional Valuation Differencesᵃ ᵇ
(t values are under the coefficients)

Model 1: CAR_{it}ᶜ = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + \epsilon_{it}, where firm i = 1, 2, 3, ..., 153
Model 2: CAR_{it} = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + a_3 UEPS*GTH_{it} + a_4 UEPS*RSK_{it} + \epsilon_{it}
Model 3: CAR_{it} = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + a_5 UCAP*GTH_{it} + a_6 UCAP*RSK_{it} + a_7 UCAP*EEPS_{it} + \epsilon_{it}
Model 4: CAR_{it} = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + a_3 UEPS*GTH_{it} + a_4 UEPS*RSK_{it} + a_5 UCAP*GTH_{it}
         + a_6 UCAP*RSK_{it} + a_7 UCAP*EEPS_{it} + \epsilon_{it}

Coeff     a_0       a_1       a_2       a_3       a_4       a_5       a_6       a_7       F-Stat   Adj R²
E(Sgn)              (+)       (+)       (+)       (-)       (+)       (-)       (+)

Panel A. Does Not Control for Size-Related Differences in the Lead-Lag Structure of Returns

Model 1   -0.09***  0.79***   0.02                                                        22.14    .1290
          -4.67     15.23     0.32

Model 2   -0.09***  0.80***   0.02      0.72***   -0.20**                                 21.12    .1378
          -4.82     11.51     0.07      4.55      -1.99

Model 3   -0.09***  0.78***   0.15*                         0.11      -0.23**   -0.05     18.65    .1292
          -4.66     15.21     1.34                          0.81      -1.69     -0.31

Model 4   -0.09***  0.80***   0.14      0.72***   -0.18**   -0.01     -0.19*    -0.01     18.05    .1374
          -4.80     11.34     1.24      4.46      -1.80     -0.03     -1.43     -0.02

Panel B. Controls for Size-Related Differences in the Lead-Lag Structure of Returns

Model 1   -0.12***  1.01***   0.42***                                                     29.38    .1659
          -5.45     17.35     5.36

Model 2   -0.12***  1.01***   0.42***   0.78***   -0.17*                                  21.36    .1731
          -5.59     12.79     5.36      4.37      -1.51

Model 3   -0.12***  1.01***   0.42***                       0.46**    -0.22*    0.34**    25.31    .1697
          -5.36     17.39     3.32                          1.98      -1.47     1.76

Model 4   -0.13***  1.01***   0.41***   0.73***   -0.15*    0.29      -0.19*    0.39**    23.82    .1758
          -5.68     12.72     3.25      4.05      -1.34     1.21      -1.27     2.02


a In panel A, N = 2142 firm years, which represents 153 manufacturing companies pooled over the period 1976-1989.
In panel B, N = 2141 firm years, which represents 152 manufacturing firms in 1976 (we lose one observation from 1975
because CAR is computed starting in year t-1) and 153 manufacturing companies studied over the period 1977-1989.

b To capture factors which affect the intercept each year, our pooled regression model includes dummy variables for every
year from 1976 to 1988 (we exclude 1989). We do not include the individual year dummy coefficients in our tables.
The dummy coefficients are positive and statistically significant at the .05 level every year except 1981 and 1984 (which
are positive but statistically insignificant).

c The variable definitions are the same as in table 3. In panel A, CAR is the cumulative market adjusted returns over a
12-month period starting on April 1 of each year. In panel B the accumulation window for CAR varies according to
the size of the firm. We divide our sample into three equal sized groups based on beginning of the year equity market
values: large, medium, and small firms. The CAR for the large and medium sized firms is based on the cumulative excess
returns over 15 months starting on August 1 of year t-1; the CAR for the small firms is based on the cumulative excess
returns over 15 months starting on November 1 of year t-1.

*** statistically significant at least at the .01 level for a one-tailed test (except for the intercept) 

** statistically significant at least at the .05 level for a one-tailed test 

* statistically significant at least at the .10 level for a one-tailed test 
TABLE 5
Separate Year Regressions of Excess Returns on Unexpected Earnings, Unexpected
Capital Expenditures, and Variables which Mediate for Cross-Sectional
Valuation Differencesᵃ ᵇ
(t values are under the coefficients)

Model 1: CAR_{it}ᶜ = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + \epsilon_{it}, where firm i = 1, 2, 3, ..., 153
Model 2: CAR_{it} = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + a_3 UEPS*GTH_{it} + a_4 UEPS*RSK_{it} + \epsilon_{it}
Model 3: CAR_{it} = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + a_5 UCAP*GTH_{it} + a_6 UCAP*RSK_{it} + a_7 UCAP*E(EPS)_{it} + \epsilon_{it}
Model 4: CAR_{it} = a_0 + a_1 UEPS_{it} + a_2 UCAP_{it} + a_3 UEPS*GTH_{it} + a_4 UEPS*RSK_{it} + a_5 UCAP*GTH_{it}
         + a_6 UCAP*RSK_{it} + a_7 UCAP*E(EPS)_{it} + \epsilon_{it}

Coeff     a_0       a_1       a_2       a_3       a_4       a_5       a_6       a_7       F-Stat   Adj R²
E(Sgn)              (+)       (+)       (+)       (-)       (+)       (-)       (+)

Panel A. Does Not Control for Size-Related Differences in the Lead-Lag Structure of Returns

Model 1   0.001     1.02***   0.03                                                        12.95    .1281
          0.04      5.98      0.26

Model 2   0.001     0.91***   0.02      1.40***   -0.06                                   9.54     .1685
          0.01      7.37      0.19      2.73      -0.30

Model 3   -0.001    1.03***   0.05                          0.15      -0.20     -0.01     6.11     .1357
          -0.01     6.04      0.33                          0.52      -1.00     -0.04

Model 4   0.001     0.90***   0.11      1.55***   -0.02     -0.18     -0.22     0.01      6.00     .1743
          0.02      6.79      0.76      3.02      -0.08     -0.66     -1.31     0.02

Panel B. Controls for Size-Related Differences in the Lead-Lag Structure of Returns

Model 1   0.01      1.34***   0.52***                                                     18.05    .1764
          0.53      8.60      5.60

Model 2   0.004     1.12***   0.48***   1.92***   0.20                                    11.72    .2150
          0.25      6.76      4.89      3.97      0.98

Model 3   0.004     1.35***   0.38***                       0.43      -0.02     0.17      8.02     .1829
          0.31      8.50      2.59                          1.29      -0.11     0.80

Model 4   0.002     1.11***   0.43[?]   1.89***   0.26      -0.01     -0.08     0.26      7.21     .2186
          0.16      6.59      3.02      3.87      1.27      -0.03     -0.49     1.12


a Regression models are estimated for each year from 1976 to 1989. The sample in table 5 is slightly smaller than in table 4
because we delete observations with Cook's D values of at least 1. The pooled regressions in table 4 do not have any
observations with Cook's D values of one or more.

b The reported coefficients, F-statistics, and adjusted R²s are the averages of the individual year's estimates. The
t-statistics are based on the Fama and MacBeth t-test (1973).

c The variable definitions are identical to those in table 4.
*** statistically significant at least at the .01 level for a one-tailed test 
** statistically significant at least at the .05 level for a one-tailed test 
* statistically significant at least at the .10 level for a one-tailed test 
information differences by adjusting the lead-lag structure of returns according to the size of
the firm. 

In panel B of both tables, the coefficient on unexpected capital expenditures (UCAP) 
becomes positive and statistically significant (at the .01 level) in all of the models. In model 1 of 
table 5, for example, the coefficient goes from .03 in panel A to .52 in panel B. The capital 
expenditures response coefficient is positive in each of the 14 separate year regressions compared
to being positive in only seven out of 14 years in panel A (not shown). Similarly, in model 1 of 
table 4, the coefficient on UCAP becomes much larger, going from .02 in panel A to .42 in panel 
B. The coefficients on the capital expenditures mediating variables generally have the predicted 
signs in models 3 and 4 of both tables. As expected, they are statistically significant in models 3 
and 4 of table 4.¹⁵
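
The averaging of year-by-year estimates can be illustrated with a short sketch. Under the Fama and MacBeth (1973) approach cited in the notes to table 5, the t-statistic is the mean of the annual coefficients divided by the standard error of that mean. The function name and the sample coefficients below are hypothetical and are not taken from the study.

```python
import math
import statistics


def fama_macbeth_t(yearly_coefs):
    """Return (mean coefficient, t-statistic) from per-year estimates.

    The t-statistic is mean / (sample std dev / sqrt(number of years)).
    """
    n = len(yearly_coefs)
    if n < 2:
        raise ValueError("need at least two yearly estimates")
    mean = statistics.mean(yearly_coefs)
    std_err = statistics.stdev(yearly_coefs) / math.sqrt(n)
    return mean, mean / std_err


# Hypothetical coefficients from four annual regressions (illustrative only).
mean_coef, t_stat = fama_macbeth_t([0.4, 0.6, 0.5, 0.5])
print(round(mean_coef, 3), round(t_stat, 3))
```

Because the standard error comes from the dispersion of the annual estimates, a coefficient that is stable across years earns a large t-statistic even when each single-year estimate is imprecise.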

Our primary sensitivity analysis involves testing whether liquidity plays an important role in 
the firm's capital expenditures decisions. We include a variable in all of our models which 
measures the firm's unexpected operating cash flows. The view is that unexpected increases 
(decreases) in operating cash flows might encourage managers to increase (decrease) their 
investments. The results remain unchanged. We also include earnings levels variables as previous 
work has done (e.g., Lev and Thiagarajan 1993) and find that the capital expenditures coefficients 
remain strongly significant with the same signs. We employ the Durbin-Watson test to determine 
whether the error terms in the pooled regressions in table 4 are autocorrelated. The error terms are 
shown to be virtually uncorrelated. 
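
For reference, the Durbin-Watson statistic is the sum of squared first differences of the residuals divided by the sum of squared residuals; values near 2 indicate little first-order autocorrelation. A minimal sketch follows, using hypothetical residuals rather than the study's data.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denom = sum(e * e for e in residuals)
    if denom == 0:
        raise ValueError("residuals are all zero")
    # Sum of squared differences between consecutive residuals.
    numer = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    return numer / denom


# Hypothetical residual series (illustrative only).
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # strongly negative autocorrelation
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # perfectly positive autocorrelation
```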


V. SUMMARY AND CONCLUSIONS 


Many models in accounting and finance predict that investment information has a significant, 
positive relation to stock returns. Empirical research, however, has not provided consistent and 
strong evidence of this relationship. After controlling for size-related predisclosure information 
differences, we provide strong evidence that unexpected capital expenditures, in conjunction with 
mediating variables for growth, risk and earnings levels, provide incremental value relevant 
information beyond unexpected current earnings and its mediating variables. 

Given the widespread use of financial statement information besides earnings in fundamental 
analysis, further research is needed to understand the relation between non-earnings variables,
earnings, and firm value. Our work demonstrates the need in future studies to control for cross-
sectional valuation differences in investments such as capital expenditures.


15 The statistical significance of our results in table 5 could be understated as noted in footnote 12.
APPENDIX

Description                                     Data Item                  Sources
Book value of equity                            #60                        Compustat
Market value of equity                          #24 × #25                  Compustat
Monthly market returns                          Equal weighted             CRSP
Earnings per share                              #58                        Compustat
Adjustment factor                               #27                        Compustat
Capital expenditures                            #128                       Compustat
Market closing price                            #24                        Compustat
Common stock outstanding                        #25                        Compustat
Working capital from operations                 #110                       Compustat
Cash from changes in noncash current accounts   Δ#5 - Δ(#4 - #1)           Compustat
Cash flows from operations¹⁶                    #110 + ΔCurrent Accounts   Compustat


16 Cash flow from operations (#308) is available in 1987 for some firms and is generally available afterwards for all firms.
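
The cash flow construction in the appendix can be sketched as a small calculation. The sketch assumes the usual Compustat meanings of the items (#1 cash, #4 current assets, #5 current liabilities, #110 working capital from operations); the function and argument names and the sample figures are hypothetical.

```python
def cash_from_noncash_current_accounts(cl_prev, cl_curr,
                                       ca_prev, ca_curr,
                                       cash_prev, cash_curr):
    """Change in #5 minus change in (#4 - #1), per the appendix."""
    delta_current_liabilities = cl_curr - cl_prev
    delta_noncash_current_assets = (ca_curr - cash_curr) - (ca_prev - cash_prev)
    return delta_current_liabilities - delta_noncash_current_assets


def cash_flow_from_operations(working_capital_from_ops, delta_current_accounts):
    """#110 plus the cash effect of changes in noncash current accounts."""
    return working_capital_from_ops + delta_current_accounts


# Hypothetical firm-year: liabilities rise by 10, noncash current assets by 20.
delta = cash_from_noncash_current_accounts(50, 60, 200, 230, 40, 50)
print(delta, cash_flow_from_operations(100, delta))
```

An increase in noncash current assets that outpaces the increase in current liabilities absorbs cash, so the estimate falls below working capital from operations.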
L. TODD JOHNSON, Future Events: A Conceptual Study of Their Significance for Recognition
and Measurement (Norwalk: Financial Accounting Standards Board, 1994, pp. viii, 56)


Cases are increasingly used at both the undergraduate and graduate level as an important vehicle for 
developing critical analytical skills and introducing a bit of realism into the accounting curriculum. A variety
of types of cases can be found. Some focus on financial reporting research skills. Others stress manipulation 
and interpretation of excerpts from financial reports. However, very few cases are designed to support 
courses, or segments of courses, that address the conceptual foundations and related policy-making issues 
of the current financial reporting system. If you have lamented the absence of cases that assist the student 
in understanding the role and importance of an abstract conceptual structure for financial reporting, I 
strongly recommend that you consider this monograph. It includes a series of cases incorporating specific,
concrete future events scenarios that can provide an intellectually stimulating basis for classroom discussion 
of existence, recognition and measurement issues. 

The monograph was not prepared for classroom use. At the 1993 Conference of Standard-Setting
Bodies in London, 45 participants from 18 countries and three international organizations discussed the 
relationship of future events to decisions regarding the recognition and measurement of assets and liabilities. 
This monograph was prepared by a working group composed of 12 board members and senior staff from 
the standard setting bodies of Australia, Canada, the United Kingdom, and the United States, as well as the 
Secretary General of the International Accounting Standards Committee (IASC), to facilitate discussion at
the conference. 

The monograph presents an informative analysis of the significance of future events in determining 
financial reporting standards. The body of the report includes a brief (16 page) discussion of the recognition 
and measurement issues associated with future events, including short summaries of the working group's 
views. Many interesting relationships and views are introduced, including such things as the role of 
management intent, alternative ways of perceiving “past events," alternative ways of interpreting the notion
of "probable," and alternative ways of assessing probabilities in the context of an accounting recognition
or measurement issue. This analysis is supported by two very interesting and useful appendices. One 
appendix presents a comparative summary of the current positions of the standard-setting bodies repre- 
sented on the working group on five fundamental concepts: assets, liabilities, recognition, criteria, and 
measurement. Although each of the five tables is brief, they are quite informative. The alternative 
perspectives on the concepts are likely to be a source of some of the differences in the financial reporting 
standards of the four countries and the IASC. The second appendix (30 pages) presents 12 cases which were 
designed by the working group to facilitate and stimulate discussion and analysis of recognition and 
measurement problems involving future events. These cases, and the brief analyses accompanying each of 
them, are an important source of classroom material. The issues are framed and discussed in terms of 
fundamental conceptual elements. Does the item satisfy the relevant definition? Does it satisfy recognition 
criteria? How can it be measured? In this manner, the conceptual elements and criteria gain some measure 
of concreteness.
Educators interested in financial reporting policy and theory will enjoy and benefit from reading this
monograph. I am even more enthusiastic about the monograph’s potential in the classroom. According to 
the author, "the case-study exercise at the conference proved to be an enlightening process as conference 
participants were...able to discuss the issues and their views on those issues largely in the context of the 
IASC’s conceptual framework" (p. v). I believe that the case studies can play the same role at our universities.
I encourage you to consider this approach. 

THOMAS H. WILLIAMS 
Professor of Accounting 
University of Wisconsin-Madison 


ROBERT VAN RIPER, Setting Standards for Financial Reporting: FASB and the Struggle for 
Control of a Critical Process (Westport, CT: Quorum Books, 1994, pp. vi, 206) 


The supporters of neutrality in the standard-setting process for financial reporting will find an eloquent 
advocate of their cause in Robert Van Riper. Those who prefer to envisage accounting standards as those 
that successfully emerge from political negotiation will not be amused by his argument. 

From 1973 to 1991, Van Riper was public relations counsel to the Financial Accounting Standards 
Board, and he witnessed the drama of the “Struggle for Control” of the standard-setting process from the 
inside. His is clearly a view with which there would have been great sympathy within the Board. The hero 
of the piece is Donald J. Kirk, a charter member of the Board and its chairman from 1978 to 1986. A strong 
supporting actor is Oscar Gellein, whose Board term was only from 1975 to 1978. Clearly, Van Riper 
admires them both. The villain is the small segment of Corporate America that has, through The Business 
Roundtable and the Financial Executives Institute, increasingly sought to constrain the standard-setting 
process so as to serve its parochial interests. A supporting actor in this latter cause was a sufficient majority 
of the board of trustees of the Financial Accounting Foundation (the Board’s parent) who did not effectively 
defend the Board from this unprincipled intervention. 

Van Riper’s book is well written and was scrupulously researched. The contents can be appreciated by 
a large audience, and the book would be provocative reading in a course that undertakes to prepare students 
for the rough-and-tumble real world of accounting.

Van Riper begins with some background that helps one understand why the FASB, the first full-time 
accounting standard setter anywhere, was installed in 1973. In chapters 3 and 4, he discusses how the FASB 
handled the pressure brought by bankers in 1975-76 over its proposed accounting by creditors for the 
restructuring of troubled debt (Statement of Financial Accounting Standards No. 15), the criticisms
emanating from the Moss and Metcalf investigations of 1976-77, and the intense and insistent lobbying in
1976-77 by the petroleum industry over the attempt to eliminate full costing (SFAS 19). He calls the last
of these episodes "the first full-blown controversy about the economic and social consequences of an 
accounting standard—and probably the most intensely politicized accounting argument ever" (p. 56). Now 
that the Board has announced its intention not to require income-statement recognition of the compensation 
expense implicit in stock options, following fierce pressure from Congressional quarters, Van Riper might 
revise that opinion. 

But the drama intensifies when, in the early 1980s, Roger Smith (of "Roger and Me"), General Motors' 
CEO and, as Van Riper enjoys pointing out, a member of the Wheat Study Group that was the architect of 
the FASB, took control of The Business Roundtable's Accounting Principles Task Force. Under the
leadership of Smith and his successor, John S. Reed of Citicorp, the Task Force has pressed relentlessly for
changes in the way in which the Board conducts its affairs. Chapters 7 through 10 are the centerpiece of the
book, in which Van Riper traces the series of Task Force initiatives, sometimes executed in the name of FEI, 
to hamstring the FASB so as to promote the favored economic and social consequences of immediate interest 
to "a tiny handful of powerful men who were in a position at least to appear to speak for all American
business" (p. 142). Even the Securities and Exchange Commission, he finds, has been unable to resist the 
"siren call" of economic and social consequences. The saddest part of the story deals with the failure of the 
majority of FAF trustees to defend the Board from these influences. 

In chapters 11 and 12, Van Riper discusses and illustrates the "perverse factors" in the financial
reporting environment that "brought down the predecessors of the present standard-setting arrangement and
are showing signs of eventually negating the 'bold experiment' of the FASB itself" (p. 176). He predicts a
gloomy future for private-sector standard setting, particularly if those who speak for Corporate America do 
not come to realize the quintessential importance of neutrality in the standard-setting process. 
Van Riper's book contains the fullest catalogue yet seen in print of the efforts over the years to frustrate
the FASB in fulfilling its mandate. But the book is also a useful guided tour, again from the inside, of the 
Board's operations during its first 20 years. It is like reading all of the Board's Status Reports and annual 
reports with someone at your side explaining why everything happened. 

It is symptomatic of the state of the accounting "profession" that, in Van Riper's account, the American
Institute of Certified Public Accountants as well as the Big 8/6 firms and their senior partners are largely seen 
as standing on the sidelines during the great struggle of the 1980s and early 1990s between the Task Force 
and the FASB. He quotes approvingly a view expressed in 1991 by former Board member Arthur R. Wyatt: 
“Many attesters seem to have lost their ability and/or willingness to incur potential disfavor with one or more 
of their clients by taking a position on controversial issue" (p. 184). Indeed, Van Riper maintains that the 
intensification of the competitive market for the services of the major public accounting firms in the 1980s 
"gave rise to a strange kind of cautious antagonism toward the standard setters" (p. 162). 

My only disappointment with the book is editorial. The index is mainly one of names, and most of the 
subjects discussed at some length in the book (e.g., stock options, accounting for changing prices, pension 
accounting, other post-retirement benefits) are not given entries. Moreover, there are no sub-headings in the 
chapters, compounding the difficulty for a reader who is trying to find the coverage of particular subjects. 

STEPHEN A. ZEFF 
Herbert S. Autrey Professor of Accounting 
Rice University 
ABSTRACT: The purpose of this study is to determine the prevalence, magnitude
and timing of retiree health care benefit reductions and to identify determinants of the
benefit-reduction decisions. Three explanations for these benefit reductions are
examined: (1) increased contracting cost caused by the financial reporting consequences
of SFAS No. 106, (2) financial weakness independent of SFAS No. 106 and
(3) firm-specific increases in retiree health care costs. Strong support for the
increased contracting cost hypothesis is found after controlling for industry, financial
weakness and firm-specific changes in retiree health care costs. However, the
results also indicate that firms cutting benefits are financially weaker and have higher 
retiree health care costs at the time benefits are reduced. Therefore, SFAS No. 106 
I. INTRODUCTION


This study examines possible determinants of firms’ decisions to reduce benefits of retiree
health care plans. The reductions are accomplished through plan amendments that limit
employer payments, increase retiree copayments, tie benefits to years of service or
completely eliminate health care coverage. The objective of this study is to determine the
prevalence, magnitude and timing of the benefit reductions and to identify cross-sectional and 
time-varying factors associated with decisions to cut employer-sponsored retiree health benefits. 

This study extends two lines of accounting research: economic consequences of mandated 
accounting changes and the nature of decisions to modify retiree benefit contracts. Whereas much 
of the economic consequences literature focuses on security price reaction to a mandated 
accounting change (Watts and Zimmerman 1990, 138), few studies examine management actions 
in response to a change (as in Imhoff and Thomas 1988). We examine the link between the passage
of Statement of Financial Accounting Standards No. 106: Employers’ Accounting for 
Postretirement Benefits Other than Pensions (SFAS No. 106) and benefit reductions in retiree 
health care contracts. In conducting the research, we attempt to address problems inherent in 
examining the effects of a mandated accounting change (see Ball 1980). 

Because of the parallels with pension contracts, this study also supplements research 
examining management decisions to remove excess assets from pension plans (see for example, 
Mittelstaedt 1989; Thomas 1989; Mittelstaedt and Regier 1993). The pension research found a 
high correlation between financial weakness and the decision to recapture the pension plan 
overfunding. This study ascertains whether a similar pattern extends to reductions in unfunded 
retiree health care benefits. 

We examine three potential explanations for why firms reduce postretirement health care 
coverage: (1) increased contracting cost caused by the financial reporting consequences of SFAS 
No. 106; (2) financial weakness independent of SFAS No. 106; and (3) firm-specific increases 
in retiree health care costs. The influence of SFAS No. 106 and other factors on retiree health care 
reduction decisions is of special interest because of the controversy surrounding the passage of 
SFAS No. 106. We find that 89 percent of health care benefit reductions are made within one year 
of SFAS No. 106 adoption and the effect of SFAS No. 106 on leverage is more negative for firms 
that reduced benefits than for other firms. The impact of SFAS No. 106 on benefit reductions holds 
after controlling for industry membership, financial condition and changes in retiree health care 
costs. Results also indicate that firms cutting health care benefits are financially weaker than no- 
cut firms, but the benefit-cut firms exhibit less financial weakness than observed in prior research 
examining firms that reduced pension plan overfunding. The results for the increase in retiree 
health care costs explanation are mixed. Thus, SFAS No. 106 can be viewed as an important, but 
not the sole, motive for the health care benefit reductions. This result implies that debt or labor 
contracts would have been written differently had SFAS No. 106 always been in effect. 

The remainder of this paper is organized as follows. In section II, we discuss institutional 
features of retiree health plans, requirements of SFAS No. 106 and prior research. In section III, 
the alternative hypotheses are explained. Section IV describes the research design and presents 
descriptive data. In section V, we present the empirical results. Conclusions are presented in 
section VI. 


II. BACKGROUND 


The General Accounting Office (GAO) estimates that about one-third of all U.S. private 
sector employees are employed by firms that provide free or highly subsidized health benefits to 
retired employees and their dependents (GAO 1990, 6), making such plans an important 
component of retiree health care coverage.¹ Despite their importance to retirees, the benefits are
increasingly at risk: Many firms have reduced coverage they offer current or future retirees or 
both, while employees at many other firms believe their health care plans are targeted for future 
reductions (Goeppinger and Dobbelaere 1993). The passage of SFAS No. 106 and rapidly rising 
health care costs are generally cited as the principal motivating factors underlying the numerous 
plan amendments and curtailments reducing retiree health care benefits (see Grant 1993; Mazo 
1993). These and other factors are discussed in more detail below, following a brief history of 
retiree health care benefit accounting and relevant prior research. 


Retiree Health Care Benefit Accounting 


Although retiree health plans have existed since the 1950s (Warshawsky 1992), accounting 
disclosures concerning retiree health care costs were not required until the mid-1980s. In 1984, 
Statement of Financial Accounting Standards No. 81: Disclosure of Postretirement Health Care 
and Life Insurance Benefits (SFAS No. 81) required employers to disclose, in their annual 
financial statements, information about the benefits provided, employee groups covered, ac- 
counting and funding policies and cost recognized for each accounting period that income is 
reported. In the vast majority of cases, cost equaled the annual cash outlay for retiree health 
benefits (hereafter, pay-as-you-go cost).²

SFAS No. 106 requires firms to change to an accrual basis of accounting that parallels 
Statement of Financial Accounting Standards No. 87: Employers’ Accounting for Pensions 
(SFAS No. 87).³ Whereas most pension plans were overfunded upon adoption of SFAS No. 87,
most firms had not prefunded any of their postretirement health care obligations due to the lack 
of tax incentives and legal requirements. Estimates of this unfunded liability for all firms range 
from $221 billion to $332 billion (Mittelstaedt and Warshawsky 1993, 17). Warshawsky et al. 
(1993, 195) estimate that the median retiree health care liability per firm (net-of-tax) is $46 
million or approximately six percent of a firm's market value of common equity.⁴

Upon transition to SFAS No. 106, firms could elect immediate recognition (as a cumulative 
effect of an accounting change) or delayed recognition (as amortization over a period not to 
exceed 20 years) of the existing unfunded retiree health care benefit liability (transition liability). 
In either case, the total liability is disclosed in financial statement notes. As with pensions, service 
cost, interest cost, actual return and amortization and deferral components of net other postretire- 
ment benefits cost/expense also must be disclosed. Warshawsky et al. (1993, 195 and 196) 
estimate that SFAS No. 106 decreases income in the years after adoption by approximately five 
percent if the transition liability is recognized immediately and by approximately eight percent 


1 Although only about four percent of all companies provide retiree health coverage, the GAO estimated in 1990 that 43
percent of companies with over 500 workers provided retiree health coverage (GAO 1990, 4). At that time, companies 
with retiree health coverage employed 38 million workers who could receive health benefits upon retirement and 
provided health benefits to over 5 million existing retirees. Approximately 40 percent of the retirees were under age 65. 

2 Warshawsky (1992, 110) reports that only 49 of 676 publicly-held firms with health plans used some form of accrual 
accounting in 1988. Only 19 prefunded benefits. Based on our examination of footnote disclosures, most firms accruing 
health care benefits prior to adopting SFAS No. 106 were using methods not acceptable under SFAS No. 106 (e.g., 
accruing all retiree health care cost at 

3 See Warshawsky et al. (1993) for a more detailed summary of the SFAS No. 106 provisions.

4 The Warshawsky et al. (1993) statistics are based on a 476-firm sample for which 1988 pay-as-you-go cost and
Compustat data were available. Their sample is a subset of the Warshawsky (1992) sample described more thoroughly 
in the sample selection portion of this study. 
if the delayed recognition approach is elected. In addition, if immediate recognition is elected, the
cumulative effect of the accounting change is estimated to cause a 73 percent median decrease
in income and a five percent median increase in balance sheet debt in the year SFAS No. 106 is
adopted.
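
The two transition elections can be contrasted with a small numerical sketch. It assumes straight-line amortization over the 20-year maximum mentioned above and ignores service cost, interest cost, and taxes; the function name and the liability figure are hypothetical.

```python
def transition_charges(transition_liability, method, years=20):
    """Annual income charges for the transition liability.

    method="immediate": the whole liability is charged in the adoption year.
    method="delayed": the liability is amortized straight-line over `years`
    (an illustrative assumption, not the study's computation).
    """
    if method == "immediate":
        return [float(transition_liability)] + [0.0] * (years - 1)
    if method == "delayed":
        return [transition_liability / years] * years
    raise ValueError("method must be 'immediate' or 'delayed'")


# Hypothetical transition liability of 200 (in millions).
immediate = transition_charges(200, "immediate")
delayed = transition_charges(200, "delayed")
print(immediate[0], delayed[0], sum(delayed))
```

Both elections charge the same total against income; they differ only in timing, which is why the immediate election produces the large one-time effect in the adoption year.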

Although SFAS No. 106 was passed in December 1990, mandatory adoption was delayed to
fiscal years beginning after December 15, 1992. However, Securities and Exchange Commission 
(SEC) Staff Accounting Bulletin No. 74 (SAB No. 74) required firms to discuss, each year prior 
to adoption, the impact of SFAS No. 106 in the Management Discussion and Analysis section of 
their 10-Ks. Few liability estimates were provided in 1990 financial statements, but most firms
gave point or range estimates in 1991. For our sample, 60 percent adopted SFAS No. 106 prior 
to 1993 ("early adopters”). 


Prior Research 


Both Amir (1993) and Mittelstaedt and Warshawsky (1993) present evidence consistent with 
the security market using firms’ pay-as-you-go cost disclosures to make retiree health care 
liability estimates in the years prior to the adoption of SFAS No. 106. Mittelstaedt and
Warshawsky’s (1993, 30) results suggest that the impact of retiree health care liabilities on share 
price is less than that of other balance sheet liabilities. They state that although this finding may 
be due in part to measurement error associated with their estimate of the retiree health care liability 
variable, it is consistent with the market anticipating firm-specific or government actions to 
reduce future health care payouts. These findings raise the issue of why managers reduce benefits 
as a result of SFAS No. 106 if security prices fully reflect retiree health care liabilities prior to its
adoption.

Espahbodi et al. (1991) provide limited evidence related to this issue. They found security 
prices declined following issuance of the Exposure Draft on accounting for nonpension 
postretirement benefits (which was issued with modification as SFAS No. 106) for firms with few 
current retirees relative to active employees, firms with high leverage ratios, small firms as 
measured by market value of equity and firms consistently using the pay-as-you-go method to 
account for postretirement benefits other than pensions (1991, table 5, Event #8). The authors 
interpret this result as being consistent with the Exposure Draft having an indirect effect on cash 
flows by reducing firms’ optimal contracting technology and increasing the probability of 
technical default on debt. This study continues this line of research by examining the impact of 
the pending adoption of SFAS No. 106 on contracting in explaining managers’ decisions to reduce 
health care benefits after considering two additional factors: financial weakness independent of 
SFAS No. 106 and firm-specific increases in retiree health care cost. These factors are discussed 
in the next section.


III. DEVELOPMENT OF HYPOTHESES


Increased Contracting Costs Caused by SFAS No. 106 


There was a common presumption that adoption or pending adoption of SFAS No. 106 
caused or would cause firms to reduce retiree health benefits: 


Their resulting awareness of the immensity of their long-term obligations has moved 
many employers to reduce or eliminate the promise to active workers and limit the 
benefit for retirees (Atkins 1993, 2). 


In March 1993, the U.S. Senate Committee on Labor and Human Resources, Subcom- 
mittee on Labor, held hearings on the erosion of employer-sponsored retiree health 
benefits and the impact on workers and businesses. Many of the employers that testified
before the Committee placed the blame squarely on FAS 106, the ever-escalating cost 
of health care and the lack of national health insurance (Rappaport and Malone 1993, 39). 


Because SFAS No. 106 has no direct cash flow effects, these statements rely implicitly on 
the argument that the adoption of the pronouncement has indirect cash flow effects. One plausible 
indirect cash flow effect is the cost associated with moving a firm closer to debt-covenant 
restrictions. A technical violation of a debt covenant gives lenders the option to exercise 
contractual rights that potentially impose renegotiation, refinancing and restructuring costs on the 
borrowing firm (Beneish and Press 1993, 235). For example, renegotiated agreements often result
in higher interest rates and new covenants that restrict the investing and financing activities of the
debtor (Beneish and Press 1993, 246-248).

Research indicates that it is costly to violate debt covenants, and since covenants contain
accounting-based restrictions, managers act to minimize the probability of technical default. 
Beneish and Press (1995, 343) report that their sample of 87 first-time disclosers of technical 
default experienced a highly significant mean 3.52% reduction in equity value during the 3-day 
period surrounding the default announcement. In addition, Sweeney (1994, 294) finds that during 
years -5 to +2 surrounding technical default, the cumulative effect of accounting changes made
by 130 firms violating debt covenants is significantly more income-increasing than changes
made by non-defaulting firms in a matched sample. 

Much of the research examining the debt-contracting hypothesis uses debt ratios as a proxy 
for the existence and tightness of accounting-based debt covenants (i.e., higher debt ratios are 
associated with covenant tightness). Both Duke and Hunt (1990) and Press and Weintrop (1990) 
provide empirical support for this proxy’s use.⁵ Recently, Ali and Kumar (1994) presented evidence
that accounting choice decisions depend jointly on the magnitude of their financial
statement effect and firm-specific characteristics such as leverage. 

For our study, management's decision to reduce retiree health care benefits depends jointly 
on the tightness of the accounting-based covenants prior to adoption of SFAS No. 106 and the 
magnitude of the SFAS No. 106 effect on covenant tightness. Increased contracting costs could 
accompany adoption of SFAS No. 106 when existing leverage is high or the SFAS No. 106 effect 
is large or both. On average, firms that have high leverage prior to adoption of SFAS No. 106 or 
have large increases in leverage from adoption of SFAS No. 106 or both are most likely to reduce 
retiree health care benefits. 

This discussion leads to the following hypothesis: 


H1: Firms that reduce retiree health care coverage have higher leverage before considering 
the effects of SFAS No. 106 and/or experience greater increases in leverage from SFAS 
No. 106 than other firms prior to reducing coverage.
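
The two leverage components in H1 can be made concrete with a small sketch. It assumes leverage is measured as debt over debt plus equity and that recognizing the after-tax retiree health liability moves that amount from equity to debt; both the measure and the figures are hypothetical illustrations, not the definitions used in the study.

```python
def leverage_effect(debt, equity, after_tax_liability):
    """Return (leverage before, leverage after, change) on recognition.

    Assumes leverage = debt / (debt + equity) and that recognition adds the
    after-tax liability to debt while reducing equity by the same amount.
    """
    total = debt + equity
    if total <= 0:
        raise ValueError("debt + equity must be positive")
    before = debt / total
    after = (debt + after_tax_liability) / total
    return before, after, after - before


# Hypothetical firm: debt 400, equity 600, after-tax liability 50.
before, after, change = leverage_effect(400, 600, 50)
print(round(before, 3), round(after, 3), round(change, 3))
```

Under H1, benefit-reducing firms would be expected to show a higher starting ratio, a larger change, or both.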


Financial Weakness 


Pension plan contractions and terminations were important tools of corporate finance during 
the early and mid-1980s. Several studies have explored the determinants of firms’ decisions to 
contract or terminate overfunded plans (Stone 1987; Mittelstaedt 1989; Thomas 1989).⁶ These


5 See Press and Weintrop (1990, table 3) for a partial list of other researchers using leverage ratios as a surrogate for 
covenant tightness. 

6 In these studies contracting firms continue defined benefits and recapture overfunding slowly through changes in
actuarial assumptions that reduce the level of future plan contributions whereas terminating firms recapture overfunding 
immediately and may or may not continue defined benefits. 
studies find that both contractors and terminators are financially weak prior to the contraction or
termination decision, a finding consistent with two hypotheses. The first follows from the Myers 
and Majluf (1984) proposition that financially weak firms liquidate financial slack when 
internally generated cash flow is less than cash requirements and that these liquidations follow 
a pecking order where less costly sources of capital are obtained first. Mittelstaedt (1989) and 
Thomas (1989) argue that overfunded pension plans can be viewed as sources of financial slack 
with plan contractions and terminations representing liquidations of financial slack. The results 
of these studies indicate that firms draw on other sources of financial slack before drawing on 
overfunded pension plans, implying that plan terminations and contractions are more costly 
sources of capital (Thomas 1989, 386-388). Only financially weak firms (i.e., firms that had
exhausted other sources of capital) draw on overfunded pension plans. The results of these studies 
are also consistent with a second hypothesis, that the value of upholding an implicit contract 
diminishes as a firm's financial health weakens. As a result, financially weak firms are expected 
to terminate plans, with the weakest firms completely eliminating defined benefit coverage 
(Mittelstaedt and Regier 1993). 

While pension plan terminations brought large amounts of cash into firms within a short 
period of time, reductions in retiree health care benefits generally lower cash outflows over a long 
period of time. As a result, the cash flow effect of the health care benefit cuts studied in this paper 
is similar to the contraction method of reducing pension plan overfunding discussed above. 
However, firms can reduce pension plan overfunding through plan contractions and terminations 
without changing the implicit or explicit pension contract and employees bear little loss (i.e., for 
all plan contractions and those terminations where the level of defined benefit coverage remains 
unchanged). This is not the case for reductions in health care benefits. When health care benefits 
are cut, the reductions in cash outflows represent a direct wealth transfer from plan participants 
to a firm's other claimholders because the benefits paid for continuing coverage are less than 
under the original implicit contract. Consequently, employees have challenged the legality of 
health care benefit cuts, but the courts have tended to rule: 


...that there is no ongoing employer obligation to provide retiree health care if the written 
plan document and summary plan description include reservation-of-rights language, at 
least in the absence of express commitments to the contrary (Mazo 1993, 7). 


Thus, in many cases the explicit contract allows firms to amend or cancel plans unilaterally. 
This discussion leads to the following hypothesis: 


H2: Firms that reduce retiree health care coverage are financially weaker than other firms 
prior to reducing coverage. 


Whereas the increased contracting cost hypothesis predicts that the effects of SFAS No. 106 
interact with financial weakness and lead to the reduction in benefits, this hypothesis suggests that 
financially weak firms would have reduced benefits without the introduction of SFAS No. 106. 


Firm-Specific Increases in Health Care Costs 


Economy-wide increases in retiree health care costs have resulted from increases in the health 
care cost inflation rate, demographic factors and decreases in Medicare coverage. Since the early 
1980s, there has been a continued real annual increase in health care costs of approximately four 
percent (see discussion of table 2, panel A presented in section IV). Perhaps the most important 
demographic factor is the aging of the population: Older people have higher health costs, and the 
economy-wide ratio of persons age 65 and older to working age persons (18—64) will increase 
from .20 in 1985 to .33 in 2025 (Rappaport and Malone 1993, 4). Additionally, due to advances 
in medical technology and wellness efforts, U.S. retirees are living longer and incurring high 
medical costs in their later years. Finally, Medicare benefits have been eroding over the past 
decade, shifting higher copayments and deductibles to individuals or retiree health plans. 
Some firms may be impacted by the rising costs more than others.7 For example, firms and 
industries experiencing demographic shifts from younger to older workforces must channel more 
resources per active employee to support the promises made to those who have retired or the scope 
of the promises must be scaled back and modified. Similarly, mergers, restructurings and 
corporate downsizings over the past decade resulted in many early retirements. These early 
retirees tend to create a disproportionate increase in a firm’s retiree medical liability because 
Medicare is unavailable until age 65 and medical expenditures for early retirees are 35 to 60 
percent greater than expenditures for same-age active employees (Rappaport and Malone 1993, 
25). Finally, differences in plan generosity and firms’ ability to pass along health care costs to 
customers result in some firms absorbing more of the increase in medical costs than other firms. 
This discussion leads to the third hypothesis: 


H3: Firms that reduce retiree health care coverage experience greater firm-specific increases 
in health care costs than other firms prior to reducing coverage. 


The effects of economy-wide increases in retiree health care costs are discussed in the next 
section. 


IV. RESEARCH DESIGN AND DESCRIPTIVE STATISTICS 


The research design addresses three common problems encountered when examining the 
effects of changes in accounting policy: identification, timing and control (see Ball 1980, 32). 
Ball refers to identification as the problem that arises when changes in beliefs about the 
environment occur concurrently with (1) changes in prior accounting rules' ability to meet 
information needs of the new environment and (2) changes in beliefs about possible new 
accounting rules. Thus, a researcher may not be able to disentangle the environmental and 
accounting effects. The timing problem arises because the environment and beliefs change 
gradually over time rather than abruptly. Finally, the control problem relates to the first two 
problems in that the environment is not stationary around the time of the accounting change. As 
a result, periods prior to the accounting change and firms not affected by the new accounting rules 
may not serve as adequate controls. 

We implement several procedures to mitigate these concerns. First, we include plausible 
factors other than the passage of SFAS No. 106 in univariate and multivariate analyses of benefit- 
cut decisions. Second, the impact of SFAS No. 106 is defined narrowly as the interaction of the 
increase in leverage arising from the pronouncement and the existing leverage prior to adoption 
of the pronouncement. These first two procedures relate primarily to the identification issue and 
are included to address Ball’s criticism of studies that attempt to attribute any change in 
management behavior to an accounting change when the research design considers neither 
environmental features nor the specific effects of the accounting change on contracts linked to 
financial statements. Third, we examine only firms that sponsor retiree health plans and used pay- 
as-you-go accounting prior to adopting SFAS No. 106. Fourth, analyses are conducted using 
unadjusted and industry-adjusted independent variables. The third and fourth procedures relate 
primarily to control issues. We address the timing problem by examining changes in factors over 


7 Rappaport and Malone (1993) provide evidence on variation in retiree health care costs by employer. 
a four-year time period, which in most cases, includes two years before the passage of SFAS No. 
106. In assessing the propensity to cut benefits, we examine the period beginning nine years 
before the passage of SFAS No. 106. The research design is discussed in greater detail below. 


Definition of Variables 


Firms disclosing health care benefit reductions in their financial statements or Form 10-K 
filings are classified as benefit-cut firms. We observe five types of benefit reduction which are 
defined below.8 


(1) Cap employer contributions. Firms that place fixed dollar caps on employers' total future 
contributions for retiree health care. The cap is often set at the level of contributions made 
by the firm in a prior year or that will be made in a predetermined future year, such as 1996. 

(2) Increase copayment amounts. Firms that increase copayment requirements for benefits such 
as prescription drugs, dental, vision or other medical-related benefits. Firms which switch 
from the coordination of benefits method or the medigap coverage method to the carve-out 
method in coordinating Medicare benefits are also classified in this category.9 

(3) Tie benefits to years of service. Firms that amend plans to tie medical benefits to years of 
service. For example, plans may stipulate that retirees receive a certain percentage of medical 
credit for each year of service, starting at age 40 up to a maximum of 20 years for 100 percent 
coverage. 

(4) Change to a defined contribution plan. Firms that change from defined benefit to defined 
contribution plans. In defined contribution plans firms fund individual accounts that can be 
used to pay for health care upon retirement. Health benefits depend on the amounts 
contributed, investment return and forfeitures from participants leaving the firm. There is no 
guaranteed health benefit upon retirement. 

(5) Eliminate benefits. Firms that eliminate health care benefits for certain classes of employees. 


In addition, some firms disclose that benefits are reduced but do not specify how benefits are 
reduced. 

For benefit-cut firms year 0 is defined as the year that the reduction in health care benefits 
is disclosed in the financial statements or Form 10-K filings; for no-cut firms it is a randomly 
assigned year such that the overall distribution mirrors the fiscal year distribution observed for 
the benefit-cut firms. The independent variables for the benefit-cut and no-cut firms are measured 
as of year -3 through year 0. 
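
The year 0 assignment for no-cut firms can be illustrated with a short sketch. It is not the procedure actually used: the text says only that the assigned years mirror the benefit-cut distribution, so the weighted random draw, the seed, and the function names below are assumptions. The yearly counts are the row totals reported in table 2, panel B.

```python
import random
from collections import Counter

# Benefit-cut firms by disclosure year (row totals reported in table 2, panel B).
CUT_YEAR_COUNTS = {1989: 3, 1990: 4, 1991: 10, 1992: 54}


def assign_year_zero(n_no_cut, cut_year_counts, seed=0):
    """Draw a pseudo year 0 for each no-cut firm, weighted by the
    benefit-cut year distribution (hypothetical helper)."""
    rng = random.Random(seed)  # seed is an assumption, used only for reproducibility
    years = list(cut_year_counts)
    weights = [cut_year_counts[year] for year in years]
    return rng.choices(years, weights=weights, k=n_no_cut)


def measurement_window(year_zero):
    """Fiscal years corresponding to relative years -3 through 0."""
    return [year_zero + offset for offset in range(-3, 1)]


no_cut_year_zero = assign_year_zero(131, CUT_YEAR_COUNTS)
print(Counter(no_cut_year_zero))   # distribution of assigned years
print(measurement_window(1992))    # [1989, 1990, 1991, 1992]
```

A weighted draw reproduces the benefit-cut distribution only in expectation; an exact proportional allocation would be an equally plausible reading of the text.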

An interaction variable is used to test the increased contracting cost hypothesis. Recall from 
section II that increased contracting cost depends jointly on the tightness of accounting-based 
covenants and the magnitude of the SFAS No. 106 effect on covenant tightness, with existing 
leverage considered a proxy for the existence and tightness of covenant restrictions. Existing 


8 Several firms instituted managed care programs (such as PPOs or HMOs); we did not classify them as reducing health 
care benefits for two reasons. First, these programs have a minor impact on the expected postretirement liability for 
retirees age 65 and over. Medicare is itself a form of managed care, and the incremental cost savings arising from further 
imposition of managed care programs on retirees are marginal (Coopers and Lybrand 1992, 61). Second, these programs 
typically provide cost savings in terms of health care expenditures for active workers, but do not necessarily represent 
a decline in either the level or quality of services provided. There is no clearly defined cost-shifting from employers to 
employees of the type discussed below. Instead, the plans achieve cost savings through more efficiently providing health 
care benefits to active workers. 

9 See Warshawsky (1992, Chapter 2) for a further discussion regarding the coordination of employer-provided and 
Medicare benefits. 
leverage is defined as total debt to total assets without the effect of SFAS No. 106. The change 
in leverage resulting from SFAS No. 106 is defined as the change in total debt from SFAS No. 
106 (net-of-tax) to total assets (i.e., the SFAS No. 106 liability (net-of-tax) to total assets).10 The 
interaction variable is therefore the product of existing leverage and change in leverage resulting 
from SFAS No. 106.11 The form of the interaction variable is similar to one used by Ali and Kumar 
(1994, 98) to assess the joint impact of leverage and the income effect arising from SFAS No. 87 
on the decision to adopt SFAS No. 87 early. 
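
The interaction variable defined above can be written as a minimal sketch. The function names and the example firm are hypothetical; only the two ratios and their product come from the text.

```python
def existing_leverage(total_debt, total_assets):
    """Total debt to total assets, before any effect of SFAS No. 106."""
    return total_debt / total_assets


def sfas106_leverage_change(liability_net_of_tax, total_assets):
    """SFAS No. 106 liability (net-of-tax) to total assets."""
    return liability_net_of_tax / total_assets


def contracting_cost_interaction(total_debt, total_assets, liability_net_of_tax):
    """Product of existing leverage and the SFAS No. 106 change in leverage."""
    return (existing_leverage(total_debt, total_assets)
            * sfas106_leverage_change(liability_net_of_tax, total_assets))


# Hypothetical firm: debt 600, assets 1,000, net-of-tax SFAS No. 106 liability 50.
interaction = contracting_cost_interaction(600.0, 1000.0, 50.0)
print(round(interaction, 4))  # 0.03
```

Because the variable is a product, it is large only when a firm is both highly levered and heavily affected by SFAS No. 106, which is the joint condition the increased contracting cost hypothesis describes.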

In most cases, the calculation of the SFAS No. 106 liability is based on 1991 or 1992 
disclosures. Because these disclosures are not available for the entire four-year period and given 
that the actual estimates of the liability change little from 1991 to 1992, only year 0 data are 
presented for this variable. For firms where both 1991 and 1992 disclosures are available, 1991 
liabilities are used because they are more likely than 1992 amounts to be estimated before any cuts 
in health care benefits (see table 2, panel A). If after-tax amounts are not disclosed, after-tax 
amounts are estimated from pretax figures using a 34 percent tax rate. If ranges instead of point 
estimates are disclosed, the midpoint of the range is used.12 
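
The two estimation rules in this paragraph (midpoint of a disclosed range, and a 34 percent tax rate applied to pretax amounts) can be sketched as follows. The function names, argument layout, and example figures are hypothetical.

```python
ASSUMED_TAX_RATE = 0.34  # rate the text applies when only pretax amounts are disclosed


def point_estimate(low, high=None):
    """Return the disclosed point estimate, or the midpoint of a disclosed range."""
    return low if high is None else (low + high) / 2.0


def net_of_tax_liability(low, high=None, is_pretax=False, tax_rate=ASSUMED_TAX_RATE):
    """Convert a disclosed SFAS No. 106 liability to an after-tax point estimate."""
    amount = point_estimate(low, high)
    return amount * (1.0 - tax_rate) if is_pretax else amount


# Hypothetical disclosures (in $ millions).
print(round(net_of_tax_liability(100.0, 200.0, is_pretax=True), 2))  # pretax range -> 99.0
print(round(net_of_tax_liability(80.0), 2))                          # after-tax point -> 80.0
```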

Three variables are used to test the financial weakness hypothesis: total debt to total assets, 
after-tax income from continuing operations to total assets and cash flow from operations to total 
assets. These or similar variables have been used in pension studies to assess financial health (see 
for example, Francis and Reiter 1987; Mittelstaedt and Regier 1993). 

Two variables are used to test the firm-specific increase in health care cost hypothesis, retiree 
pay-as-you-go cost to sales revenue and the percentage change from the prior year in retiree pay- 
as-you-go cost to sales revenue. These variables assume a relation between managers' decisions 
to reduce health care benefits and the magnitude, or change in the magnitude, of health care costs 
relative to sales revenue. For the change in magnitude variable, this implies that firms cutting 
benefits are unable to pass along all increases in health care costs to consumers. Articles appearing 
in the financial press support this contention. For example, General Motors' executives have 
indicated that GM's industry-high health care cost per unit of sales must be reduced either by 
benefit cuts or national health-care reform if GM is to compete successfully with domestic and 


10 Including an after-tax amount in the numerator of the SFAS No. 106 liability (net-of-tax) to total assets ratio assumes 
that the corresponding deferred tax asset associated with the adoption of SFAS No. 106 is completely offset by existing 
deferred tax liabilities. This is not the case for all firms in our sample. However, including a before-tax amount in the 
numerator and allocating the tax effect between the deferred tax liability and asset has very little effect on this ratio for 
such firms. 

11 Results are similar if the increased contracting cost variable is redefined as the percent change in the total debt to total 
asset ratio caused by SFAS No. 106. Minor differences in results are discussed later in the multivariate analysis section. 
We prefer the interaction variable over the percentage change variable because it is more consistent with theory and 
better captures degree of leverage prior to the adoption of SFAS No. 106. Because SFAS No. 106 does not affect assets 
in the pro forma calculations, the percentage change in debt variable reduces to the SFAS No. 106 liability (net-of-tax) 
divided by total debt prior to the effect of SFAS No. 106. 

12 SFAS No. 106 liability information was available for the full 202-firm sample (discussed below) in 1992 but for only 
153 firms in 1991. In 1992, 75 percent of the firms gave after-tax point estimates, 16 percent gave pretax point estimates, 
three percent gave after-tax range estimates and six percent gave pretax range estimates. In 1991, for those firms 
reporting estimates, 28 percent gave after-tax point estimates, 19 percent gave pretax point estimates, 21 percent gave 
after-tax range estimates and 32 percent gave pretax range estimates. 

13 For the financial weakness hypothesis, univariate and multivariate tests were also performed using market value of 
common equity as the scaler and all results are qualitatively similar to those reported in tables 3 and 4. We also tested 
pension plan funding level as an additional proxy for financial weakness (see Choi et al. 1994, 25). Pension plan funding 
level does not appear to be correlated with the health care benefit-cut decision or other measures of financial weakness. 
foreign rivals (Franklin 1993, 6). Also, the Management Discussion and Analysis section of 
Navistar International’s 1992 annual report discusses postretirement costs as a percentage of sales 
revenue. 

The majority of sample firms adopted SFAS No. 106 prior to 1993. As a result, post-adoption 
year disclosures are adjusted to make them comparable to data from pre-adoption years. The 
procedures used to adjust post-adoption disclosures are as follows: 


1. The portion of the after-tax income effect of moving to SFAS No. 106 contained in after-tax 
income from continuing operations is added to after-tax income from continuing operations; 

2. If pay-as-you-go cost is not disclosed, it is derived by subtracting the pretax income effect 
contained in income from continuing operations from other postretirement benefit expense 
under SFAS No. 106; 

3. The pretax transition liability actually recognized (as opposed to reported in 1991) is deducted 
from firm liabilities; and 

4. Liabilities are increased and/or assets are decreased for the effects of SFAS No. 106 on deferred 
income tax liabilities or assets. 


Cross-sectional differences in the independent variables for benefit-cut and no-cut firms 
could result from differences in firm-specific or industry-specific factors or both. To control for 
industry effects, analyses are conducted using unadjusted and industry-adjusted independent 
variables. Following procedures outlined in Foster (1986, 178), we define an industry-adjusted 
variable as the unadjusted variable less the industry median with this difference divided by the 
interquartile range. 
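
The industry adjustment can be sketched with the standard library. The text specifies only the formula (the variable less the industry median, divided by the industry interquartile range); the quartile interpolation method, the function name, and the example data are assumptions.

```python
from collections import defaultdict
from statistics import median, quantiles


def industry_adjust(values, industries):
    """Return (value - industry median) / industry interquartile range
    for each firm, in the original order (hypothetical helper)."""
    by_industry = defaultdict(list)
    for value, industry in zip(values, industries):
        by_industry[industry].append(value)

    stats = {}
    for industry, group in by_industry.items():
        # The "inclusive" quartile method is an assumption; the text does not
        # say how quartiles were interpolated.
        q1, _, q3 = quantiles(group, n=4, method="inclusive")
        stats[industry] = (median(group), q3 - q1)

    return [(value - stats[industry][0]) / stats[industry][1]
            for value, industry in zip(values, industries)]


# Hypothetical leverage ratios for five firms in one industry group.
leverage = [0.50, 0.60, 0.70, 0.80, 0.90]
groups = ["steel"] * 5
adjusted = industry_adjust(leverage, groups)
print([round(a, 2) for a in adjusted])  # the median firm maps to 0.0
```

The sketch fails with a division by zero if an industry's interquartile range is zero, which is more likely in small groups. This is consistent with the sample excluding industries represented by fewer than four firms.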

Data relating to health care benefit reductions, pay-as-you-go costs, SFAS No. 106 liabilities 
and deferred income taxes are hand-gathered from financial statements and Form 10-K filings 
using Corporate Text and the National Automated Accounting Research System (NAARS) as 
sources.14 Other data are obtained using Compustat. Because the SEC’s SAB No. 74 required 
financial statement disclosure of the impact of SFAS No. 106 in fiscal years prior to adoption, 
these data requirements do not limit our sample to early adopters of SFAS No. 106. 


Sample Selection 


Our sample begins with the 548 firms identified in Warshawsky (1992, 110) as disclosing 
pay-as-you-go costs in 1988 through word searches of the November 11, 1990 version of 
Corporate Text. This version contains the financial data of 2,215 firms that file with the SEC and 
that are listed on either the New York Stock Exchange or American Stock Exchange. Thirty- 
one percent of these firms (676 firms) disclosed that they provided health care benefits, but only 
548 firms disclosed the pay-as-you-go cost. When compared to the firms not providing health care 
benefits, the benefit providers are, on average, larger firms (market value of equity and number 
of employees six and five times larger, respectively), with lower price-earnings ratios (14.5 
compared to 21.4), and higher S&P bond ratings (Warshawsky 1992, 113).16 As discussed later, 
the firms providing health care benefits come from a variety of industries. 


14 Fifteen firms disclosed inconsistent amounts as pay-as-you-go costs at least once in adjacent fiscal years. This was 
attributable to restatements of prior years' cost to reflect a change in reporting entity (e.g., a major acquisition or 
divestiture, or, for some 1988 data, consolidation of a majority-owned subsidiary that was previously reported under the 
equity method). For these firm-year observations, we included the pay-as-you-go cost that appeared to be consistent with 
other data for that observation. 

15 We greatly appreciate Mark Warshawsky's willingness to provide the names and CUSIPs of these firms. 

16 See Warshawsky et al. (1993) for additional comparisons of firms sponsoring health care plans and firms not 
sponsoring plans. 
All 187 firms in the utility and finance industries are eliminated for two reasons. First, for 
utilities the treatment of health care costs is regulated by state commissions.17 Second, financial 
ratios of utilities and financial institutions are much different than ratios of firms in other 
industries and cash flow data are not collected by Compustat for these two industries. Five firms 
are eliminated because the reductions in retiree health benefits occurred prior to 1989. This cutoff 
is necessary because SFAS No. 81 disclosure requirements did not become effective until 1985, 
and pay-as-you-go cost information for the four years prior to the reduction year (year 0) are 
needed for the time-series analysis.18 Twenty-two firms that accrued retiree health benefits and 
15 firms that did not distinguish pay-as-you-go cost for active and retired participants prior to 
adoption of SFAS No. 106 are eliminated because pay-as-you-go costs for these firms are not 
comparable to the retiree pay-as-you-go costs disclosed by other firms. We eliminate 98 firms 
because of missing Compustat or hand-collected data. To perform the industry-adjusted analyses, 
we eliminate 19 firms from nine industries represented by fewer than four firms.19 These 
procedures result in a sample of 202 firms: 71 that reduced health care benefits and 131 that did 
not reduce benefits. 
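
The sample attrition described above can be checked with a short sketch. All counts come from the text; the labels and the helper function are illustrative.

```python
INITIAL_FIRMS = 548  # firms disclosing pay-as-you-go costs in 1988

# (reason, firms eliminated), in the order given in the text.
EXCLUSIONS = [
    ("utility and finance industries", 187),
    ("benefit reductions prior to 1989", 5),
    ("accrued retiree health benefits", 22),
    ("retiree pay-as-you-go cost not distinguished", 15),
    ("missing Compustat or hand-collected data", 98),
    ("industries with fewer than four firms", 19),
]


def apply_exclusions(initial, exclusions):
    """Return (reason, eliminated, remaining) after each exclusion step."""
    remaining = initial
    trail = []
    for reason, eliminated in exclusions:
        remaining -= eliminated
        trail.append((reason, eliminated, remaining))
    return trail


trail = apply_exclusions(INITIAL_FIRMS, EXCLUSIONS)
for reason, eliminated, remaining in trail:
    print(f"{reason:<46} -{eliminated:>3}  remaining {remaining}")

final_sample = trail[-1][2]
print(final_sample)  # 202 = 71 benefit-cut firms + 131 no-cut firms
```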


Descriptive Statistics 


For the 71 firms reducing benefits, 16 capped employer contributions, 15 increased 
copayment amounts, five tied benefits to years of service, two changed to a defined contribution 
plan and eight eliminated health care benefits for certain classes of existing employees.20 In 
addition, 25 firms stated that benefits were reduced, but did not specify how benefits were 
reduced. Of the 46 firms disclosing benefit-reduction type, 67 percent capped expenses or 
increased copayments. 

Within each benefit-reduction category, reductions can be applied to four employee groups: 
new hires, fully eligible active plan participants, partially eligible active plan participants and 
retirees. For the 35 firms that disclose employee group, 43 percent of the benefit cuts relate to all 
participants (including retirees) and new hires, 51 percent relate to all active participants and new 
hires, and six percent relate to partially eligible active participants or new hires. 

Thirty-four firms report the dollar effects of their cuts in retiree health benefits. The ratio of 
cut amount to the pre-cut SFAS No. 106 liability ranges from 0.2% to 80.0% with a mean of 25.7% 
and a median of 22.0%. The ratio of the benefit-cut amount to the market value of common equity 
ranges from 0.05% to 222% with a mean of 15.7% and a median of 2.3%. The benefit-cut 
amounts as a percentage of market value exceed the costs of technical default (range of -48.83% 
to 27.84%, mean of -3.52%, and median of -1.49%) reported by Beneish and Press (1995, 343). 
Because shareholders bear the entire median 1.49% loss from technical default, but plan 
participants bear much of the loss associated with the reduction in health care benefits, firms have 
incentives to reduce retiree benefits to avoid the costs of technical default.21 


17 D'Souza (1994) examines the decision to reduce retiree health care benefits by firms in the electric utility industry. She 
focuses on several factors that are unique to the utility industry. In addition, Khurana and Loudder (1994) report results 
that are consistent with investors perceiving the wealth effect of SFAS No. 106 passage to be different for public utilities 
than for firms in other industries. 

18 These five firms are included in the frequency analysis of health care benefit reductions by benefit-cut year (see first 
line in table 2, panel A). 

19 Non-industry-adjusted results including the 19 firms are very similar to the results reported for the 202-firm sample. 

20 Seven firms that reduced benefits for existing employees also eliminated benefits for new hires. 

21 The wealth effect for stockholders from retiree benefit reductions depends on whether future contracting costs with 
employees increase as a result of breaching an implicit contract to continue retiree health care benefits at present levels. 
The effect on security prices at the time of the benefit reduction is an empirical issue. 
There does not appear to be a clear ordinal relation between type and severity of benefit cut. 
Capping expense, increasing copayments or tying benefits to years of service can each result in 
cuts that exceed 39 percent of the pre-cut SFAS No. 106 liability. Depending on the specific terms 
of the plan amendment, each type of cut may result in mild or severe reductions in benefits. 

Table 1 analyzes the industry membership of the benefit-cut and no-cut firms. The industry 
groupings follow those used in Biddle and Seow (1991) and Warshawsky et al. (1993). The 
sample includes firms from a variety of industry groupings and there is minimal industry 
clustering. Of the 24 industry groups, only eight groups represent more than five percent of the 
sample and no group represents more than eight percent of the sample. The proportion of benefit- 
reducing firms within an industry grouping equals or exceeds 50 percent in the publishing, 
electronic components, computers, automobiles and air transportation industry groupings. No 
firms reduce benefits in the mining industry or the glass, cement and ceramic industry group. 

In our sample, 35.1% of the firms reduced health care benefits. This statistic may be more 
meaningful when compared to Thomas’ (1989, table 3) analysis of firms with overfunded pension 
plans. When the Utilities and Finance industries are deleted from Thomas' sample, to make it 
comparable to ours, 33.4% of firms with overfunded pension plans take action to recapture 
overfunding either immediately by plan termination (11.1%), or slowly by plan contraction 
(22.3%). The 33.4% frequency is similar to the 35.1% frequency of benefit reductions observed 
in our sample. However, recall from section III, that pension plan modifications usually do not 
reduce the ultimate retirement benefit, whereas health care reductions represent a direct wealth 
transfer from plan participants to firms' other claimholders. 

Table 2 (panel A) shows the number of sample firms reducing health care benefits in each 
year from 1982 to 1992, the U.S. medical and U.S. general inflation rates for each of those years, 
and the sample-specific medical inflation rate from 1987 to 1992 for the firms in our sample 
partitioned into benefit cutters and non-cutters. The 76 firms reducing retiree health care benefits 
(first line in panel A) include five firms that cut benefits prior to 1989. As discussed earlier, these 
firms are eliminated in all subsequent analyses because pay-as-you-go cost data needed for the 
time-series analysis were not disclosed during the years prior to the benefit cuts. 

Consistent with Warshawsky's (1992, 110) findings, even though U.S. medical inflation 
exceeded U.S. general inflation throughout the 1980s, only 10.5% of the health care benefit cuts 
are made prior to 1990. Seventy-one percent of the benefit cuts are announced in 1992, the year 
prior to mandatory adoption of SFAS No. 106. The difference between U.S. medical and U.S. 
general inflation in 1992 is not as high as the differences observed in four of the previous ten years. 
Furthermore, in 1993 and 1994 U.S. medical inflation fell to 5.7% and 4.5%, respectively, and 
the difference between U.S. medical and U.S. general inflation fell to 2.7% and 2.0%, 
respectively.22 Thus, it is difficult to argue that 1992's high medical inflation alone led to the large 
number of benefit cuts observed in that year. In addition, in 1992 the benefit-cut firms experienced 
a lower medical inflation rate (8.0%) than the no-cut firms (11.5%). However, the opposite 
relation occurred in four of the five preceding years: benefit-cut firms experienced a 23.3% higher 
cumulative compounded medical inflation rate from 1987 through 1991 than no-cut firms. Firm- 
specific medical inflation is analyzed in further detail in section V. 


22 Further, SFAS No. 106 (par. 74.d) requires disclosure of the assumed health care cost trend rate for the next year and 
a description of the future direction and pattern of changes in the assumed trend rate thereafter. An examination of these 
disclosures for firms that adopted SFAS No. 106 in 1992 indicates that managers expected medical inflation to trend 
downward in the future. 
TABLE 1 
Industry Membership of Sample 

                                    2 and/or 3                         Benefit- 
Industry                            Digit SIC                          Cut Firms 
Mining                              10-12, 14                              0 
Oil & Gas Exploration               13, 353                                3 
Food and Tobacco                    20, 21                                 4 
Paper                               26                                     3 
Publishing                          27                                     3 
Chemicals                           280-282                                3 
Pharmaceuticals                     283                                    2 
Specialty Chemicals                 284-289                                1 
Petroleum Refining                  29                                     3 
Rubber, plastic, leather            30-31                                  3 
Glass, cement, ceramic              32                                     0 
Steel                               331-332                                2 
Metalworks                          333-336                                3 
Metal parts                         339, 34                                6 
Industrial Equipment                351, 352, 354                          6 
Small Industrial Mach.              355, 356, 358, 359                     4 
Electronic Components               367                                    2 
Computers                           357, 368                               4 
Automobiles                         371, 375                               6 
Aircraft                            372, 376                               5 
Misc. Manufacturing                 38, 39                                 2 
Commercial Transport                373, 374, 319, 40, 42, 44, 46          3 
Air Transport                       45, 47                                 2 
Department and Specialty Stores     53, 55-59, except 591                  1 

Total                                                                     71 

[No-Cut Firms column: values not recoverable from the source.] 

Note: Industry classification is based on Biddle and Seow (1991) and Warshawsky et al. (1993). 

Panel B of table 2 provides additional insights regarding the effect of SFAS No. 106 on the 
timing of benefit cuts. Although SFAS No. 106 was not required to be adopted until 1993, 69.0% 
of the benefit-cut firms had adopted it by the end of 1992.23 Note that 88.7% of the benefit-cut 
firms (63 out of 71) reduce health benefits within one year of adopting SFAS No. 106. Panel B 


23 ...-five percent of the no-cut firms had adopted SFAS No. 106 by the end of 1992. A Chi-squared test of equal proportions 
indicates that the proportion of early adopters is significantly higher for the benefit-cut firms (p-value of .05). 
TABLE 2 (Continued) 
Benefit-Cut Year, Medical Inflation Statistics and SFAS No. 106 Adoption Year 

Panel B: Comparison of Benefit-Cut Year and SFAS No. 106 Adoption Year 

                              SFAS No. 106 Adoption Year 
Benefit-             91                92                93                Row Total 
Cut Year             Freq. (Cell %)    Freq. (Cell %)    Freq. (Cell %)    Freq. (Column %) 
89                   0 (0.0%)          2 (2.8%)          1 (1.4%)          3 (4.2%) 
90                   1 (1.4%)          1 (1.4%)          2 (2.8%)          4 (5.6%) 
91                   4 (5.6%)*         4 (5.6%)          2 (2.8%)          10 (14.1%) 
92                   3 (4.2%)          34 (47.9%)*       17 (23.9%)        54 (76.1%) 
Column Total 
Freq. (Row %)        8 (11.2%)         41 (57.8%)        22 (31.0%)        71 (100%) 

* Shaded cell in the original. 
[Entries illegible in the source (the 1.4% cells in row 90, both 4s in row 91 and the 34 in row 92) are implied by the row and column totals.] 

Note: In Panel A, U.S. medical and U.S. general inflation rates are based on indices obtained from the Bureau 
of Labor Statistics. Sample-specific medical inflation rates are the median of the firm-specific percent 
change in pay-as-you-go cost to sales revenue ratios for the corresponding years. For the no-cut firms, 
medians are calculated from the same 131 observations each year; for the benefit-cut firms, the number 
of observations varies from 71, in 1987, to 54, in 1992, because medians are calculated after excluding 
firms that cut benefits in prior years. In Panel B, shaded cells represent benefit cuts disclosed in the 
same year as the SFAS No. 106 adoption. 


also highlights the need to control for the effects of SFAS No. 106 adoption in financial statement 
analysis for the post-adoption years (these procedures are discussed above). 

Table 2 suggests that SFAS No. 106 had a considerable impact on benefit-cut decisions. In 
the analysis that follows, we attempt to control for factors other than SFAS No. 106 that could be 
causing the clustering of benefit cuts near the mandatory SFAS No. 106 adoption date. 


V. EMPIRICAL RESULTS 


Univariate Results 


Table 3 presents means and medians for each independent variable partitioned into benefit- 
cut and no-cut firms and p-values from tests of between-partition differences. Four columns of 
p-values are presented, two columns for t-tests of differences in means and two columns for 
Wilcoxon rank sum tests of differences in medians. For both the mean and median analyses, the 
p-values in the first column report tests using unadjusted variables and the p-values in the second 
column report tests using industry-adjusted variables. For the tests using unadjusted data, a 
TABLE 3 
Comparison of Benefit-Cut and No-Cut Firms 

                      Analysis of Means                        Analysis of Medians 
                                      t-statistic p-value                      Wilcoxon p-value 
              Cut      No-Cut     Non-Ind.    Ind.         Cut      No-Cut     Non-Ind.    Ind. 
Rel.          Firms    Firms      Adj.        Adj.         Firms    Firms      Adj.        Adj. 
Year          n=71     n=131                               n=71     n=131 

Increased Contracting Cost Variable 

SFAS No. 106 Liability to Total Assets x Total Debt to Total Assets 
  0           .060     .032       .00         .00          .031     .020       .00         .00 

Financial Weakness Variables 

Total Debt to Total Assets 
 -3           .630     .601       .14         .01          .618     .583       .05         .01 
 -2           .636     .598       .08         .04          .620     .588       .03         .04 
 -1           .644     .601       .06         .03          .627     .586       .01         .02 
  0           .665     .606       .02         .02          .640     .593       .01         .01 

After-tax Income from Continuing Operations to Total Assets 
 -3           .048     .056       .10         .24          .045     .054       .11         .19 
 -2           .033     .053       .02         .09          .045     .048       .10         .20 
 -1           .013     .035       .03         .04          .028     .032       .18         .20 
  0           .012     .031       .03         .01          .019     .040       .03         .02 

Cash from Operations to Total Assets 
 -3           .100     .099       .55         .56          .090     .101       .33         .29 
 -2           .090     .096       .25         .35          .086     .099       .25         .44 
 -1           .079     .093       .05         .09          .080     .087       .15         .20 
  0           .062     .097       .00         .00          .072     .093       .01         .01 

Health Care Cost Variables 

Pay-as-you-go Cost as a Percent of Sales Revenue 
 -3           .40      .34        .21         .49          .26      .22        .04         .04 
 -2           .46      .32        .02         .06          .31      .23        .01         .01 
 -1           .55      .37        .01         .07          .36      .21        .00         .01 
  0           .62      .42        .01         .06          .41      .28        .00         .03 

Percentage Increase in Pay-as-you-go Cost as a Percent of Sales Revenue 
 -3           9.7      10.9       .61         .42          8.1      6.5        .47         .56 
 -2           20.1     6.7        .00         .00          12.4     6.8        .00         .01 
 -1           22.1     18.7       .22         .21          16.9     14.0       .20         .38 
  0           19.3     16.2       .30         .19          10.1     11.5       .67         .68 


Note: All p-values are for one-tailed tests. The "Non-Ind. Adj." columns are for comparisons of the raw 
variables before industry adjustments. The "Ind. Adj." columns are for comparisons of industry-adjusted 
variables defined as (the non-industry-adjusted variable less the industry median for the 
variable)/the industry interquartile range for the variable. The underlying means and medians for the 
industry-adjusted variables are not shown in the table. Industry assignment is based on the industry 
groupings shown in table 1. "Rel. year" is the year relative to year 0, defined for benefit-cut firms as 
the year that the reduction in health care is disclosed in financial statements and for no-cut firms as a 
randomly assigned year. The random assignment is based on the fiscal year distribution observed for 
benefit-cut firms. The effects of SFAS No. 106 early adoption are removed when computing ratios 
involving debt or income. 
significant p-value is consistent with finding firm, industry, or both firm and industry differences. 
Rejection of the null hypothesis using industry-adjusted data implies that benefit-cut firms are 
worse off than no-cut firms when compared to representative firms in their industries. 
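
The industry adjustment defined in the note to table 3 (the raw variable less the industry median, divided by the industry interquartile range) can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `industry_adjust`, the sample ratios, and the choice of the inclusive quartile method are assumptions, because the paper does not say how the quartiles were computed.

```python
from statistics import median, quantiles

def industry_adjust(value, industry_values):
    """Return (value - industry median) / industry interquartile range."""
    # Quartile method is an assumption; the paper does not specify one.
    q1, _, q3 = quantiles(industry_values, n=4, method="inclusive")
    return (value - median(industry_values)) / (q3 - q1)

# Hypothetical total-debt-to-total-assets ratios for one industry group.
industry = [0.50, 0.55, 0.60, 0.65, 0.70]

# A firm at .70 sits one interquartile range above the industry median of .60.
adjusted = industry_adjust(0.70, industry)
print(round(adjusted, 2))
```

An adjusted value near zero means the firm resembles the representative firm in its industry, which is why a significant difference in adjusted values points to a firm-level rather than an industry-level effect.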

In most cases, the significance levels from comparisons of the means and medians using 
either the unadjusted or industry-adjusted data are similar.²⁴ As discussed in section IV, all 
variables except the SFAS No. 106 interaction variable are calculated for year −3 to year 0. The 
variables are ordered by hypothesis. 
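
For readers unfamiliar with the Wilcoxon rank sum test reported in table 3, the following Python sketch shows the usual large-sample version of the statistic. It is a textbook illustration under stated assumptions, not a reproduction of the paper's computations: the function name `rank_sum_test` and the two small samples are hypothetical, tied values receive average ranks, and the variance carries no tie correction.

```python
import math

def rank_sum_test(sample_a, sample_b):
    """Wilcoxon rank sum with a normal approximation.

    Returns (rank sum of sample_a, z statistic, one-tailed p-value for
    the alternative that sample_a tends to be larger than sample_b).
    """
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for ties
        rank_sum_a += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(sample_a), len(sample_b)
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12  # no tie correction (assumption)
    z = (rank_sum_a - mean_w) / math.sqrt(var_w)
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))
    return rank_sum_a, z, p_one_tailed

# Hypothetical ratios for a few benefit-cut and no-cut firms.
w, z, p = rank_sum_test([5, 6, 7], [1, 2, 3, 4])
print(w, round(z, 3), round(p, 4))
```

With samples this small an exact test would normally be preferred; the normal approximation is more reasonable for group sizes like the 71 and 131 firms compared in table 3.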

Beginning with the increased contracting cost hypothesis, the univariate results indicate that 
benefit-cut firms have significantly higher interaction variable values than no-cut firms for both 
unadjusted and industry-adjusted data. In fact, the mean interaction variable is almost twice as 
large for benefit-cut firms (.060) as for no-cut firms (.032), with the comparable medians 
being about 50 percent larger (.031 for benefit-cut firms and .020 for no-cut firms). Such a finding is 
consistent with the hypothesis predicting an interaction of financial leverage with the magnitude 
of the SFAS No. 106 effect and suggests that managers cut health care benefits in part because 
of the statement's adverse financial reporting effect on debt covenants. 

Mean and median differences in the three variables used to proxy financial weakness 
(total debt to total assets, after-tax income from continuing operations to total assets and cash from 
operations to total assets) indicate that benefit-cut firms are financially weaker than no-cut firms 
in year 0, when the benefit cut is announced. This finding is supported by both parametric and 
non-parametric tests, using unadjusted and industry-adjusted data. The total debt to total assets ratio 
is also significantly different in most comparisons for the years preceding the benefit cut. 
However, the results for the two other financial weakness variables for the years prior to year 0 
are less consistent. The differences for unadjusted data generally become more significant over 
the four-year period (for example, the p-value for mean differences in the ratio of cash from 
operations to total assets declines continuously from .55 in year −3 to .00 in year 0).²⁵ However, 
the differences for industry-adjusted data in the years preceding the benefit-cut announcement are 
harder to characterize: They follow no clear pattern and are often not significantly different from 
zero. A preliminary characterization of these results for the mean and median analyses taken 
together is that the financial weakening experienced in years —3 through —1 for benefit-cut firms 
was also being experienced by their industries as a whole. This interpretation is supported by the 
insignificant industry-adjusted differences in medians for two variables: after-tax income from 
continuing operations to total assets and cash from operations to total assets. However, in the year 
of the benefit-cut announcement, the benefit-cut firms experienced greater financial weakening 
than no-cut firms in their industry, regardless of the variable used to proxy financial weakening. 

These results are consistent with the financial weakness hypothesis, but the financial 
weakness exhibited by the benefit-cut firms is not as strong as that exhibited by the pension 
terminators or pension contractors reported in Mittelstaedt (1989) and Thomas (1989). (Neither 
study examined industry-adjusted data.) Thomas' (1989, 385) results indicate significantly lower 
funds flows from operations five years prior to pension plan termination and two years prior to 
pension plan contraction. Mittelstaedt (1989, 408) indicates significantly higher bankruptcy 
scores for both pension terminators and contractors three years prior to the termination or pension 


24 Winsorizing data does not affect the univariate or multivariate analyses or conclusions. 

25 One would expect the flow variables measuring financial weakness (after-tax income from continuing operations to total 
assets and cash from operations to total assets) to decrease over time because most of the sample firms’ years −3 to −1 
correspond with 1989 to 1991, a period including the July 1990 to March 1991 recession (see Economic Report of the 
President 1993, 73). 
funding contraction.²⁶ He also finds that the change in the bankruptcy score from year −3 to year 
0 was significantly higher for the plan terminators. Significant differences in medians of the flow 
variables in our study do not occur until year 0. 

The results for the firm-specific retiree health care costs are mixed. Unadjusted and industry- 
adjusted median values of pay-as-you-go costs to sales revenue are significantly higher for the 
benefit-cut firms in all four years, and mean differences for the same variable are significantly 
higher in three of the four years. Comparisons for the second variable, the percent increase in pay- 
as-you-go costs to sales revenue, are significant in year —2 only. 

Thus, the results in table 3 provide support for the increased contracting cost hypothesis, the 
financial weakness hypothesis, and, to a lesser extent, the firm-specific increase in health care cost 
hypothesis. Further evidence on the three hypotheses is provided in the multivariate analysis 
which follows. 


Multivariate Analysis 


To control for systematic differences existing in the years prior to the benefit cuts and for 
changes in firm-specific conditions occurring concurrently with the impending adoption of SFAS 
No. 106, we estimate logit models incorporating both change and level variables.²⁷ The four 
change variables are designed to capture the change in financial strength and health care costs: 
change in total debt to total assets (ΔTDTA); change in after-tax income from continuing 
operations to total assets (ΔINCTA); change in cash from operations to total assets (ΔCASHTA); 
and change in pay-as-you-go cost to sales revenue (ΔPAYSAL). These variables are calculated 
from year −3 to year 0 or year −1, depending on the particular model. Level variables include total 
debt to total assets in year −3 (TDTA₋₃) and the pay-as-you-go cost to sales revenue in year −3 
(PAYSAL₋₃). The inclusion of these variables at year −3 is based on the significant differences 
reported in table 3. The interaction variable, representing the increased contracting cost associated 
with SFAS No. 106 (F106INT), is defined as SFAS No. 106 liability (net-of-tax) to total 
assets times total debt to total assets. Unadjusted and industry-adjusted data are used in separate 
estimations of the model for both year 0 and year −1. For all models, the dependent variable is 
coded one for benefit-cut firms and zero for no-cut firms. 

Results of logit estimations are presented in table 4. The −2 log-likelihood ratio indicates that 
all four estimates have significant explanatory power, with the year 0 estimates providing 
marginally more explanatory power than the year −1 estimates. The signs and significance levels 
for the coefficients in the models are consistent with the results obtained in the univariate analysis. 
The coefficient for the increased contracting cost hypothesis variable is significant in all four 
estimations. The results also indicate that financial weakness influences the benefit-cut decision. 
The coefficient for ΔCASHTA is significantly negative in three of four models. The coefficients 
for ΔTDTA and ΔINCTA are insignificant for all models. However, in both year 0 and year −1 the 
TDTA₋₃ coefficient is significantly positive for the industry-adjusted model.²⁸ Firm-specific 
increases in health care cost appear to play a negligible role in explaining the benefit-cut 
decisions: the unadjusted variable ΔPAYSAL is significant in the year −1 estimation, but none 
of the other coefficients using either PAYSAL₋₃ or ΔPAYSAL are significantly different from zero. 
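
To make the logit coefficients easier to read, the sketch below converts the year 0, industry-adjusted coefficients reported in table 4 into fitted probabilities. It is an illustrative calculation only: the function `cut_probability`, the dictionary keys, and both example firms are hypothetical, and the resulting numbers describe the fitted sample model rather than any population probability of a benefit cut.

```python
import math

# Year 0, industry-adjusted coefficients as reported in table 4.
COEFFICIENTS = {
    "INTERCEPT": -0.76,
    "F106INT": 0.51,
    "TDTA_m3": 0.50,     # total debt to total assets at year -3
    "dTDTA": 0.16,
    "dINCTA": -0.10,
    "dCASHTA": -0.29,
    "PAYSAL_m3": -0.09,  # pay-as-you-go cost to sales at year -3
    "dPAYSAL": -0.001,
}

def cut_probability(firm):
    """Logistic transform of the linear index; omitted variables count as 0."""
    index = COEFFICIENTS["INTERCEPT"]
    for name, beta in COEFFICIENTS.items():
        if name != "INTERCEPT":
            index += beta * firm.get(name, 0.0)
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical firm at its industry median on every adjusted variable.
p_median = cut_probability({})

# Hypothetical firm one interquartile range above the median on F106INT
# and one interquartile range below the median on the change in cash flow.
p_exposed = cut_probability({"F106INT": 1.0, "dCASHTA": -1.0})

print(round(p_median, 3), round(p_exposed, 3))
```

Because the industry-adjusted variables are scaled by the interquartile range, a one-unit change in an input corresponds to one interquartile range, which makes the two hypothetical firms directly comparable.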


26 The Mittelstaedt financial weakness measure is based on a probit model with the following variables: earnings before 
interest and taxes divided by total assets, retained earnings divided by total assets, market value of equity divided by total 
debt, working capital divided by total assets and total assets divided by sales. 

27 Previous literature includes level and change variables in models predicting pension plan terminations (see Mittelstaedt 
1989). 

28 The results for the debt variables are stronger if, instead of the interaction variable, the percent change in total debt to 
total assets caused by SFAS No. 106 is used to test the increased contracting cost hypothesis (see footnote 11). 
TABLE 4 
Logit Models of Cuts in Retiree Health Care Benefits 

Coefficients (one-tailed p-values for t-statistics in parentheses) 

                                         Year 0                         Year −1
                                Non-Industry        Industry    Non-Industry        Industry
Independent       Expected          Adjusted        Adjusted        Adjusted        Adjusted
Variables           Sign
INTERCEPT                        −1.34 (.00)      −.76 (.00)     −1.39 (.01)      −.75 (.00)
F106INT               +           12.4 (.03)       .51 (.01)      10.4 (.04)       .51 (.01)
TDTA₋₃                +            .25 (.40)       .50 (.01)       .27 (.38)       .42 (.03)
ΔTDTA                 +           1.94 (.13)       .16 (.21)      2.62 (.12)       .01 (.46)
ΔINCTA                −           3.04 (.86)      −.10 (.26)       .91 (.62)      .005 (.52)
ΔCASHTA               −          −2.89 (.05)      −.29 (.01)     −1.14 (.24)      −.19 (.09)
PAYSAL₋₃              +          −21.8 (.63)      −.09 (.76)     −6.76 (.54)      −.01 (.53)
ΔPAYSAL               +          106.5 (.12)     −.001 (.50)     237.7 (.03)       .14 (.11)
−2 Log-likelihood ratio                 22.5            28.9            20.0            23.4
Significance level                      .002            .000            .006            .001


Note: For all models, the dependent variable is coded one for benefit-cut firms and zero for no-cut firms. The 
non-industry-adjusted independent variables are defined as follows: F106INT₀₍₋₁₎ = SFAS No. 106 
liability (net-of-tax) to total assets x total debt to total assets at year 0 (−1); TDTA₋₃ = total debt to total 
assets at year −3; ΔTDTA₀₍₋₁₎ = change in debt to total assets from year −3 to year 0 (−1); ΔINCTA₀₍₋₁₎ = 
change in after-tax income from continuing operations to total assets from year −3 to year 0 (−1); 
ΔCASHTA₀₍₋₁₎ = change in cash from operations to total assets from year −3 to year 0 (−1); PAYSAL₋₃ = 
pay-as-you-go cost to sales revenue at year −3; and ΔPAYSAL₀₍₋₁₎ = change in pay-as-you-go cost to 
sales revenue from year −3 to year 0 (−1). The industry-adjusted variables are defined as (the 
non-industry-adjusted variable less the industry median for the variable)/the industry interquartile range for 
the variable. The −2 log-likelihood ratio has a chi-square distribution. 
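
As a check on how the significance levels in table 4 relate to the −2 log-likelihood ratios, the following Python sketch evaluates the upper tail of the chi-square distribution using the standard closed-form series for integer degrees of freedom. The function name `chi2_sf` is hypothetical, and the choice of seven degrees of freedom (one per slope coefficient) is an assumption, since the paper does not state the degrees of freedom. Under that assumption the computed tail probabilities round to the significance levels reported in the table.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series in chi = sqrt(x).
    chi = math.sqrt(x)
    normal_tail = math.erfc(chi / math.sqrt(2))
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)  # chi^(2r-1) / (1*3*...*(2r-1))
        total += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return normal_tail + 2 * density * total

# -2 log-likelihood ratios from table 4; df = 7 is an assumption.
for ratio in (22.5, 28.9, 20.0, 23.4):
    print(ratio, round(chi2_sf(ratio, 7), 3))
```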


Pearson pairwise correlations and multiple correlation coefficients for unadjusted and 
industry-adjusted variables indicate that collinearity should not impair estimates. In year 0, the 
highest correlations are between PAYSAL₋₃ and ΔPAYSAL (−.50 for unadjusted variables and 
−.60 for industry-adjusted variables) and PAYSAL₋₃ and F106INT (.44 for unadjusted variables 
and .22 for industry-adjusted variables). The rest of the correlations for the industry-adjusted 
variables are relatively low (the highest is .36 and most are below .25 in absolute value). The 
correlation coefficients for the unadjusted variables are typically larger than those for the 
industry-adjusted variables, with the correlations of the unadjusted financial weakness variables 
ranging from −.43 to .40. Multiple R-squared values for the unadjusted variables in year 0 are .63, 
.65 and .66, for ΔPAYSAL, F106INT and PAYSAL₋₃, respectively; the multiple R-squared values 
for the remaining independent variables are below .34.²⁹ For the industry-adjusted variables in 


29 The multiple R-squared for an independent variable is defined as the square of the maximum correlation between an 
independent variable and a linear function of other independent variables. It is computed by regressing one independent 
variable on the other independent variables (see Weisberg 1985, 49 and 197). 
year 0, ΔPAYSAL and PAYSAL₋₃ have multiple R-squared values of .51 and .49, respectively; 
the multiple R-squared values for the remaining independent variables are below .32. The 
pairwise and multiple correlations are low enough that parameter estimates should not have 
inflated variances (see Weisberg 1985, 198). 
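
One common way to read the multiple R-squared values above is through the variance inflation factor, 1/(1 − R²), which measures how much collinearity inflates the variance of a coefficient estimate. The short Python sketch below applies that formula to the R-squared values quoted in the text. Treating the variance inflation factor as the relevant yardstick is an assumption made here for illustration; the paper cites Weisberg (1985) without naming a specific threshold, and the dictionary labels are hypothetical.

```python
def variance_inflation(r_squared):
    """Variance inflation factor implied by a multiple R-squared."""
    return 1.0 / (1.0 - r_squared)

# Multiple R-squared values quoted in the text for year 0.
reported = {
    "dPAYSAL, unadjusted": 0.63,
    "F106INT, unadjusted": 0.65,
    "PAYSAL_m3, unadjusted": 0.66,
    "dPAYSAL, industry-adjusted": 0.51,
    "PAYSAL_m3, industry-adjusted": 0.49,
}

factors = {name: variance_inflation(r2) for name, r2 in reported.items()}
for name, vif in factors.items():
    print(f"{name}: {vif:.2f}")
```

Every factor computed this way is below 3, which is in line with the text's conclusion that the parameter estimates should not have seriously inflated variances.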

Even though collinearity appears low, the increased contracting cost variable could be 
masking real cash flow effects associated with high, or increased, retiree health care benefit 
payments. To explore this possibility, we re-estimate the model without the increased contracting 
cost variable. The coefficient for PAYSAL₋₃ remains insignificant in all four model estimations, 
while the p-values for the ΔPAYSAL coefficients decrease from .12 to .02 in the 
non-industry-adjusted year 0 model and from .11 to .02 in the industry-adjusted year −1 model. In addition, the 
ΔTDTA variable becomes significant in the year 0 models and in the non-industry-adjusted year 
−1 model. The re-estimated models are significant at levels ranging from .002 to .018, but the 
log-likelihood ratios decrease by 18 percent (relative to the statistics reported in table 4) for the 
unadjusted models and by 38 percent for the industry-adjusted models. The drop in overall 
explanatory power is consistent with the omission of an important explanatory variable, which 
is known to be correlated with ΔPAYSAL and PAYSAL₋₃. Therefore, the more significant results 
for the increased health care cost variables should be interpreted with caution. 

In general, the multivariate results provide additional support for the univariate results and 
conclusions: cuts in health care benefits are related to increased contracting cost caused by the 
adverse financial reporting effect of SFAS No. 106 and financial weakness independent of SFAS 
No. 106. There is limited evidence in the multivariate results that supports the importance of 
firm-specific increases in health care costs affecting the decision to cut retiree health benefits.³⁰ 

At a minimum, the foregoing results suggest that SFAS No. 106 precipitated management 
actions regarding the timing of cuts in retiree health care benefits. The data indicate a strong 
associative relation between the decision to cut retiree health care benefits and the requirement 
to adopt SFAS No. 106. Additionally, deliberations regarding SFAS No. 106 preceded virtually 
all decisions to cut benefits. Although the data are insufficient to conclude definitively that SFAS 
No. 106 caused firms to reduce retiree health care benefits, two of the three requirements 
necessary to establish such a link, association and temporal sequencing (Abdel-khalik and 
Ajinkya 1979, 26), are present. Further, relative to the final condition, elimination of competing 
hypotheses, we have eliminated or controlled for three plausible alternative explanations: 
financial weakness, firm-specific retiree health care cost and industry. 


VI. CONCLUSION 


This study utilizes benefit reductions in retiree health plans for two purposes. First, the study 
extends accounting research examining management actions in response to a mandated account- 
ing change. Second, it examines whether the decision to reduce retiree health care benefits is 
influenced by the same factors influencing the decision to reduce pension plan overfunding. 

It appears that cuts in retiree health care benefits are related to anticipated or actual adoption 
of SFAS No. 106. The majority of firms cutting retiree health care benefits (89 percent) did so 
within one year of adoption of SFAS No. 106. The cuts in health care benefits do not appear to 
be prompted by general increases in medical inflation, which existed at very high levels for ten 
years prior to the 1991—1992 surge in benefit cuts. Instead they appear to be driven by increased 


30 We also examined the following specifications of the model: (1) including the log of total assets as a control for size; 
(2) partitioning results by type or magnitude of benefit cut; and (3) controlling for industry effects by subtracting means 
instead of medians. All of these model specifications provide qualitatively similar results. 
contracting cost associated with the financial reporting consequences of SFAS No. 106, financial 
weakness independent of SFAS No. 106, and, to a lesser extent, firm-specific increases in health 
care cost. These findings are robust across industry-adjusted and unadjusted analyses. The 
benefit-cut firms exhibit less financial weakness than reported in prior studies for firms that 
reduced pension plan overfunding. 

This study explicitly examines the increased contracting cost for debt hypothesis, which we 
regard as only one of several reasonable scenarios resulting in linkages between SFAS No. 106 
adoption and management decisions to reduce retiree health care benefits. A second scenario is 
that managers reduced the adverse financial reporting impacts of SFAS No. 106 in anticipation 
of decreasing costs associated with future financing. This scenario is a broader view of the 
traditional debt-contracting hypothesis which focuses on renegotiation costs for existing debt. A 
firm with high leverage that desires to issue new debt is in a position similar to that of a firm in 
technical default. A third scenario is that the necessary examination by managers of their firms’ 
retiree health care liabilities pursuant to adoption of SFAS No. 106 accelerated decisions to reduce 
retiree benefits, and as a consequence, benefit reductions clustered near the time of SFAS No. 106 
adoption. Under this scenario, the same firms would have reduced benefits, but they would have 
accomplished it over a longer time period as it became increasingly apparent that they could not 
meet their obligations for retiree health care benefits. The plausibility of these other explanations 
for describing management behavior relative to SFAS No. 106 adoption is a topic for further 
research. 
ABSTRACT: This paper explores a possible explanation for the observed use of 
bonus pool arrangements. Under fairly general conditions, the use of a bonus pool 
arrangement results in a strict Pareto improvement by enabling an owner to exploit 
non-contractible information, that might otherwise not be used, to motivate agents. 
We characterize the optimal bonus pool arrangement and analyze the interdepen- 
dencies which it induces between agents. Our results demonstrate the manner in 
which the use of non-contractible information via bonus pool schemes distorts the 
payoffs to agents relative to the case in which the non-contractible information is not 
used. 
I. INTRODUCTION 


The plan provides for a bonus pool to be reserved in the amount of approximately 
20 percent of the company’s pre-tax income for all incentive awards....Individual 
executive awards are determined based on the committee's discretionary evaluation 
(Waterhouse Investors Services, Inc.; Proxy Statement filed with the SEC, January 
4, 1995) 

While the standard for determining the amount of the bonus pool for each profit 
center is based entirely on financial performance criteria, the allocation of the bonus pool 
among the employees entitled to participate therein involves subjective considerations 
as well. Awards from the various bonus pools to executive officers are made based upon 
the Board’s subjective evaluation of corporate performance, business unit performance 
(in the case of executive officers with business unit responsibility) and individual 
performance. (Quick and Reilly Group, Inc.; Proxy Statement filed with the SEC, May 
20, 1994) 

...the amount distributable under the Plan consisted of 15 percent of the three year 
audited pre-tax earnings increase of the Company. Distribution of individual awards 
under the bonus pool was made at the discretion of the Chairman of the Board and 
President. (Crawford & Co.; Proxy Statement filed with the SEC, March 31, 1992) 


The above incentive plans, referred to as bonus pool arrangements, have several common 
features. First, there are multiple individuals covered by each plan. Second, the method of 
determining the total amount of the bonus pool is based on an explicit formula (usually involving 
accounting earnings) and is agreed-upon ex ante. Third, the manner in which the bonus pool is 
allocated among the covered individuals is not previously agreed-upon, but rather is left to the 
discretion of the compensation committee. With respect to this latter point, Fox (1979, 10) notes 
in his survey of 211 bonus plans in manufacturing firms: “Within the general limits set by the 
bonus fund each year, the executive compensation committee has relatively wide discretion in 
fixing the amount of an individual award.” 

This paper shows that, under fairly general conditions, bonus pool arrangements enable an 
owner to exploit non-contractible information to motivate agents and thus provide a strict Pareto 
improvement.¹ We also characterize the optimal bonus pool arrangement and show that it induces 
interdependencies between agents, as well as distortions in their payoffs, relative to the case in 
which the non-contractible information is not used. These distortions are similar to empirically 
observed group payment schemes where apparently independent divisions are rewarded on the 
basis of relative performance evaluation (RPE) or total firm profits. Our results also have 
important implications for the evaluation of managerial accounting systems. Using the manage- 
rial accounting system to collect and report information can be valuable either because it reports 
otherwise unavailable information, or because it verifies information which otherwise could not 
be used within explicit contracts.² Therefore, when computing the value of the accounting 
system, one must consider whether the relevant benchmark is the case in which that information 
is not otherwise available, or the case in which it is available but not contractible. Most previous 
analyses have treated these cases as equivalent and used the first benchmark. This paper analyzes 
how the second benchmark can be established when the non-contractible information is used 
within a discretionary bonus pool. 

The paper is organized as follows. Section II provides the intuition underlying our analysis 
of bonus pool arrangements. Section III contains a review of the relevant literature. Section IV 
discusses the basic model of the firm and contains the major results. Section V extends the results 
derived under our base model to more general settings. Section VI considers the use of contracting 
arrangements other than bonus pools and provides concluding comments. 


1 Incomplete contracts are a widely observed phenomenon and the use of discretion to partially complete these contracts 
is also widespread. Not all such discretionary uses of information involve bonus pool arrangements. Our objective is 
to identify conditions under which a bonus pool arrangement is one efficient way of incorporating discretionary behavior 
within employment contracting. 

2 The latter case arises when the information may already be available through informal channels such as by "walking 
around the factory or office." 
II. INTUITION 


To understand the demand for bonus plans, consider the following example. Assume a single- 
period game with a principal and a single manager, where the latter is subject to moral hazard. 
Suppose that there is some information which, if it could be incorporated into the employment 
contract, would result in a Pareto improvement. However, assume that only the principal can 
observe the information, so the information cannot be explicitly incorporated in the contract. In 
this setting, granting the principal the discretion to pay a bonus based on the non-contractible 
information creates a moral hazard problem with respect to the principal’s use of the information. 
The principal would not pay a discretionary bonus at the end of the period based on the non- 
contractible information, because it would merely reallocate resources from him to the manager, 
without improving the manager’s incentives. 

Reputation considerations may mitigate the principal’s moral hazard problem with respect 
to the paying of a discretionary bonus. However, for there to be a demand for reputation, one must 
either assume an infinitely repeated game in which the non-contractible information is observed 
by individuals other than just the principal, or that the principal has some inherent characteristic 
or "type" that is not known to others (as in Kreps and Wilson 1982). 

Another potential way in which to mitigate the principal’s moral hazard problem, and thus 
to allow for the use of this non-contractible information, exists if there is a second manager. In 
this case, the principal can commit to fund a bonus pool based only upon the contractible 
information. The principal commits that the entire amount of the bonus pool will be paid out to 
the managers as compensation. However, the actual allocation of the bonus pool between the 
managers is left to the principal’s discretion, and therefore can be based on the non-contractible 
as well as the contractible information. 
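
The commitment logic in the preceding paragraph can be sketched in Python. This is a stylized illustration rather than the paper's model: the function `allocate_bonus_pool`, the manager names, the discretionary weights, and the earnings figure are all hypothetical, and the 15 percent rate simply echoes the Crawford & Co. plan quoted in the introduction. The point is that the total paid out depends only on the contractible measure, so the principal cannot save money by shading the subjective evaluation.

```python
def allocate_bonus_pool(contractible_measure, pool_rate, discretionary_weights):
    """Split a formula-based pool according to the principal's subjective weights."""
    pool = pool_rate * contractible_measure  # fixed ex ante by explicit formula
    total_weight = sum(discretionary_weights.values())
    return {
        manager: pool * weight / total_weight
        for manager, weight in discretionary_weights.items()
    }

earnings_increase = 1_000_000  # hypothetical contractible performance measure

# Two different subjective evaluations of the same pair of managers.
favor_a = allocate_bonus_pool(earnings_increase, 0.15, {"manager_a": 3, "manager_b": 1})
favor_b = allocate_bonus_pool(earnings_increase, 0.15, {"manager_a": 1, "manager_b": 3})

# The principal's total payout is the same under either evaluation.
print(sum(favor_a.values()), sum(favor_b.values()))
```

Each manager's award still varies with the evaluation, which is how the non-contractible information enters compensation even though the size of the pool does not depend on it.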

Setting up such a bonus pool arrangement can have both positive and negative effects. The 
positive effect is that it allows the managers to be compensated on better information than just the 
contractible information. For example, it may improve risk-sharing among the managers and the 
principal. However, allocating the bonus pool may have negative motivational effects as it sets 
up a zero-sum game between the managers with respect to the non-contractible information. We 
find that using a bonus pool results in a strict Pareto improvement so long as the non-contractible 
information is informative about at least one of the managers. This improvement arises despite 
the fact that the optimal discretionary bonus allocations lead to considerable distortions in the 
managers’ compensation arrangements, relative to the optimal no-bonus-pool contract. 


III. LITERATURE REVIEW 


Williamson (1975) asserts that a major difference between firms and markets is the 
widespread use of discretion (or fiat) in the former. Examples of discretion include: non-seniority 
based promotion, the business judgment rule for boards of directors, etc. Our work analyzes the 
demand for one particular type of discretionary behavior—the use of discretionary bonus pool 
schemes.³ 

The general problem in the incomplete contracts literature is to set up ex ante a governance 
structure which controls the opportunistic use of such non-contractible information (e.g., 
Williamson 1975, 1987; Green and Laffont 1987; Hart and Moore 1988). Of particular interest 


3 The discretionary use of subjective information to allocate bonus pools is pointed out by Demski and Sappington (1993) 
and Kaplan and Atkinson (1989). 
is how the existence of the non-contractible information distorts the initial investment and 
productive actions of the principal and manager. The present paper focuses on the discretionary 
payment of a bonus, rather than on a discretionary productive action. Further, in our model there 
is no renegotiation, and hence no explicit governance structure within which to renegotiate 
(although the set of rules governing the size of the bonus pool is somewhat analogous to the 
governance structure). In the present model, all of the discretion is vested in the principal. 
Somewhat related to this literature is Demski and Sappington (1993), who also examine a context 
in which there is non-contractible information. However, their concern is how to efficiently elicit 
that information for use in explicit contracting, not its use as a basis for discretionary payments 
by the principal. 

It has been noted elsewhere that one of the costs of giving a supervisor the right to use 
discretion in allocating jobs and promotions among agents is that it may induce the agents to 
engage in non-productive activities whose sole purpose is to influence the supervisor's discretion- 
ary decision (e.g., Milgrom 1988; Milgrom and Roberts 1987). The supervisor pays attention to 
these activities, and the agents engage in them to communicate information about their abilities. 
In our model, the agents are not better informed than the principal, and hence such influence 
activities do not arise.⁴ 

Previous work in the implicit contracting literature has shown that a Pareto improvement can 
be achieved through the sequentially rational use of non-contractible information. The logic 
underlying this work is also applicable to the case of employment contracting, but much of that 
earlier work is based on different assumptions from those used here. For example, much of that 
work assumes a demand for reputation by specifying either an infinitely-repeated world in which 
the non-contractible information is observed by individuals in addition to the principal, or a world 
in which the principal is privately informed about his type and chooses his observed use of the 
non-contractible information to signal that type (e.g., Bull 1987; Carmichael 1989; Kreps 1990). 
We assume a single-period setting in which the principal does not have a privately known type, 
and our analysis holds even if only the principal observes the non-contractible information.
Hence, issues of reputation do not arise in our model. Our paper thus provides a complementary 
explanation to that offered by the implicit contracting literature for the Pareto improvement 
resulting from the use of non-contractible information in employment contracts. 

Most closely related to the present paper is that subset of the implicit contracting literature 
which motivates the use of rank-order tournaments by their ability to use non-contractible 
information (Malcomson 1984, 1986). While this motivation is similar to our logic for the use of
bonus pools, our analysis generalizes this work along several dimensions. First, rank-order 
tournaments are a simplified form of the bonus pool arrangements studied here. For example, in 
a tournament the total prize to be paid is fixed and independent of the actual level of performance. 
In contrast, we allow the size of the bonus pool to vary in performance. As a consequence, we 
provide more general sufficiency results, as well as characterization results that cannot be derived 
in a tournament study. Second, the analysis in Malcomson (1984, 1986) relies crucially on the 
existence of non-contractible information that is informative about the relative performance of 
every agent; there is no such restriction in our analysis. Finally, Malcomson's analysis assumes
that there are essentially an infinite number of agents employed by the principal, thereby 
suppressing the strategic interdependencies induced by rank-order tournaments. In our analysis, 
these interdependencies are explicitly considered, and the optimal interdependencies character- 
ized. 


4 The model of influence costs is distinct from a model of collusion. The latter is discussed briefly in section VI.

IV. BASIC MODEL AND THE VALUE OF BONUS POOLS


We consider a single-period model of an agency (firm) owned by a risk-neutral principal. The
firm consists of two divisions, each managed by a risk- and work-averse manager. The utility
functions for the managers of division 1 and division 2 are given by G(s) − V(a) and U(s) − V(a),
respectively, where s is income and a is effort.⁵ We assume that G'(·), U'(·) > 0; G''(·),
U''(·) < 0; V'(·) > 0; and V''(·) > 0. Manager i's reservation utility for participating in the agency
is Ū_i.

Each manager chooses an action which is unobservable to the principal and the other manager
and which generates a probability distribution over his own division's outcome. The set of
possible outcomes for division i, i = 1, 2, is binary, and is either success (X_{iS}) or failure (X_{iF}). These
outcomes are jointly observed and verifiable by the principal and the managers, and are consumed
by the principal after payment of compensation to the managers. The manager of division 1
chooses action a_1 ∈ [0,1], which results in personal disutility to him of V(a_1). Conditional on
action a_1, the probability that outcome X_{1S} is realized is a_1. Similarly, the second manager's choice
of effort (a_2 ∈ [0,1]) leads to a personal disutility of V(a_2) and induces, with probability a_2,
outcome X_{2S}. In addition, as a result of manager 2's action, a second, conditionally independent
signal y_k ∈ {y_L, y_H} is produced, with Pr(y_H | a_2, X_1, X_2) = h(a_2), where h'(·) > 0 and 0 < h(·) < 1. These
assumptions imply that y_k is informative about manager 2's action choice, but imperfectly so. The
realized value of y_k is observed by the principal, but cannot be contracted on. Explicit contracts
can be based only on the jointly observed and verifiable divisional outputs, (X_1, X_2). With no loss
of generality, we assume that the managers cannot observe the y_k realization.

Note that the non-contractible information, y_k, is a signal about manager 2's action, i.e., it is
not consumed by the principal and has no direct effect on his objective function. It is also
important to note that y_k contains no information about manager 1's choice of effort. Further, the
two divisions are functionally and statistically independent of one another. Therefore, observing
the output of one division provides no information to the principal about the possible action choice
of the other division's manager.⁷
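The information structure just described can be made concrete with a short sketch. The Python below is an illustration only and is not part of the original analysis: the function names, the action levels, and the linear form chosen for h are assumptions introduced here. It enumerates Pr(X_1, X_2, y_k) and confirms that the signal carries no information about division 1's outcome.

```python
from itertools import product


def h(a2):
    """Hypothetical signal technology: increasing in a2 and strictly inside (0, 1)."""
    return 0.1 + 0.8 * a2


def joint_distribution(a1, a2):
    """Pr(X1, X2, y) for outcomes S/F and signals H/L, given actions a1 and a2."""
    dist = {}
    for x1, x2, y in product("SF", "SF", "HL"):
        p_x1 = a1 if x1 == "S" else 1.0 - a1      # division 1 succeeds w.p. a1
        p_x2 = a2 if x2 == "S" else 1.0 - a2      # division 2 succeeds w.p. a2
        p_y = h(a2) if y == "H" else 1.0 - h(a2)  # signal depends on a2 only
        dist[(x1, x2, y)] = p_x1 * p_x2 * p_y
    return dist


def prob_high_signal_given_x1(dist, x1):
    """Pr(y_H | X1 = x1), computed from the joint distribution."""
    num = sum(p for (o1, _, y), p in dist.items() if o1 == x1 and y == "H")
    den = sum(p for (o1, _, _), p in dist.items() if o1 == x1)
    return num / den


dist = joint_distribution(a1=0.65, a2=0.83)  # illustrative action levels
total_probability = sum(dist.values())
p_high_if_success = prob_high_signal_given_x1(dist, "S")
p_high_if_failure = prob_high_signal_given_x1(dist, "F")
```

Under these assumptions the two conditional probabilities coincide with h(a_2), which is the sense in which observing division 1's output says nothing about the signal.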

Before analyzing the potential usefulness of incorporating y_k when it cannot be contracted
upon, we first consider two benchmark cases, one in which y_k is never observed by the principal
and another in which y_k is not only observable by all parties, but also can be contracted upon.


Benchmark Case 1: If y_k were never observed by the principal, the contracts offered to
the managers would treat them as independent divisions. Each manager would be offered two


5 For notational convenience, we assume that the managers’ disutility functions are the same; given our analysis, this is 
without loss of generality. 

6 Clearly, if y_k were only observed by the principal, not by the managers, it would be inherently non-contractible. Other
examples of such non-contractible information include non-quantifiable or “soft” information such as: the principal’s
personal observations of the manager's ability or effort level, information gathered through other informal channels,
etc. Another example is a piece of information that would be useful for contracting but, if made public in order to make 
the contract enforceable in the courts, might be used by the firm's competitors to the detriment of the firm. Finally, our 
non-contractible information may be thought of as a proxy for unforeseen contingencies and events. Such contingencies 
can only be incorporated into the manager's compensation with some form of renegotiation, of which giving the 
principal the right to make discretionary payments is an extreme form. 

7 We retain the assumption of divisional independence throughout this paper in order to emphasize the value of bonus
pool arrangements even in such extreme settings. If the divisions were statistically or functionally dependent, there
would be an obvious demand for the relative performance evaluation inherent in bonus pool arrangements. By assuming
independence, we ensure that any demand for the bonus pool arrangement arises solely to exploit the non-contractible
information.

possible payoffs, one each for success (output X_{iS}) and failure (output X_{iF}). There would be no role
for any non-trivial linking of the managers’ compensations through the use of a bonus pool, or
for the use of discretion in compensating the managers.


Benchmark Case 2: At the other extreme, if the y_k realization could be contracted upon, the
principal would optimally vary the incentive scheme for the second manager based on the
realization of y_k, in order to better motivate his choice of a_2. That is, he would specify four possible
payoffs for the second manager by conditioning the payoffs for success and failure on the
realization of y_k. However, manager 1 would still face just two possible payoffs, i.e., his
compensation would be independent of the realized y_k. As in Benchmark Case 1, the optimal
contracts would shield each manager's compensation from the outcome of the other division.


We now analyze the scenario in which y_k is observable only by the principal and therefore
cannot be contracted on. In this setting, the optimal compensation structure may differ radically 
from the two benchmark cases discussed above. In particular, we show that there are strict benefits 
to the principal to linking the compensations of the managers through the use of a bonus pool and 
discretionary allocations. 

Consider the following compensation scheme. At the time of contracting, the principal
commits to a total bonus pool for each possible vector of output realizations for the first and
second divisions, (X_1, X_2). That is, he specifies four different total compensations for the
managers, depending on whether the divisions succeed or fail. We assume that the compensation
paid to each manager for every output realization pair, (X_1, X_2), is verifiable. The manner in which
the pool is ultimately divided up between the managers is left to the discretion of the principal.
Note that because y_k cannot be contracted upon, the total pool cannot be made a function of y_k.
However, the principal has the option of basing the individual payouts on the non-contractible
information, y_k, as well as the (X_1, X_2) realizations.

Once the principal commits to a bonus pool, he is completely indifferent as to how to
subsequently allocate it between the managers because the total compensation payment is fixed
at that point. Therefore, any allocation of the bonus pool is sequentially rational for the principal
to carry out. Consequently, we assume that the principal allocates the bonus pool in whatever way
would best motivate the managers ex ante.⁸ The managers know this and choose their actions
accordingly. Given this, there is no need to distinguish between a manager's base pay and bonus;
we thus refer to each manager's compensation as his total pay, and by total bonus pool we mean
total compensation to the two managers.
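The mechanics of the scheme can be sketched in a few lines of Python. Everything numeric below is hypothetical and chosen only for illustration, as are the names `POOL`, `MANAGER_1_SHARE`, and `allocate`. The point is that the committed total depends on (X_1, X_2) alone, while the split may respond to the signal.

```python
# Hypothetical committed pools B(X1, X2); these totals are verifiable and contractible.
POOL = {("S", "S"): 1600.0, ("S", "F"): 1350.0,
        ("F", "S"): 1000.0, ("F", "F"): 750.0}

# Hypothetical discretionary rule R(X1, X2, y): what the principal pays manager 1.
MANAGER_1_SHARE = {
    ("S", "S", "H"): 1200.0, ("S", "S", "L"): 1450.0,
    ("S", "F", "H"): 1230.0, ("S", "F", "L"): 1340.0,
    ("F", "S", "H"): 610.0,  ("F", "S", "L"): 820.0,
    ("F", "F", "H"): 620.0,  ("F", "F", "L"): 740.0,
}


def allocate(x1, x2, y):
    """Split the committed pool for (x1, x2) using the principal's private signal y."""
    total = POOL[(x1, x2)]
    pay_1 = MANAGER_1_SHARE[(x1, x2, y)]
    pay_2 = total - pay_1  # manager 2 receives the remainder of the pool
    return pay_1, pay_2


split_high = allocate("S", "S", "H")
split_low = allocate("S", "S", "L")
```

Because both splits exhaust the same pool, the principal's total outlay is unaffected by which signal he acts on, which is why any allocation is sequentially rational for him.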

We now characterize the optimal bonus pool arrangement and study whether it is in the
principal's interest to make the allocations a non-trivial function of the observed signal, y_k. Let
B(X_1, X_2) denote the total bonus pool contingent upon division 1’s output being X_1 and division
2’s being X_2. Let the first manager's discretionary payoff allocation in this contingency,
conditional on the non-contractible information being y_k, k = L, H, be R(X_1, X_2, y_k). The second
manager's payoff in that case is B(X_1, X_2) − R(X_1, X_2, y_k). To minimize notation, we often refer to
B(X_{1S}, X_{2S}) as B_{SS}, R(X_{1S}, X_{2S}, y_H) as R_{SSH}, and so on. The optimal choices of the bonus pools and
discretionary allocations are given by the solution to the following program:


8 Note that the principal will carry out the “optimal allocation" ex post even though he is the only one who observes the
realized value of y_k. This issue is discussed further in section VI.


Baiman and Rajan—The Informational Advantages of Discretionary Bonus Schemes 563 


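MODEL 1:

\[
\max_{a_1,\,a_2,\,R(\cdot),\,B(\cdot)} \;
\begin{aligned}[t]
& a_1\bigl[a_2\,(X_{1S}+X_{2S}-B_{SS}) + (1-a_2)\,(X_{1S}+X_{2F}-B_{SF})\bigr] \\
&+ (1-a_1)\bigl[a_2\,(X_{1F}+X_{2S}-B_{FS}) + (1-a_2)\,(X_{1F}+X_{2F}-B_{FF})\bigr]
\end{aligned}
\]

subject to:

\[
\begin{aligned}
& a_1 h(a_2)\bigl[a_2 G(R_{SSH}) + (1-a_2) G(R_{SFH})\bigr]
 + (1-a_1) h(a_2)\bigl[a_2 G(R_{FSH}) + (1-a_2) G(R_{FFH})\bigr] \\
&+ a_1 \bigl(1-h(a_2)\bigr)\bigl[a_2 G(R_{SSL}) + (1-a_2) G(R_{SFL})\bigr] \\
&+ (1-a_1)\bigl(1-h(a_2)\bigr)\bigl[a_2 G(R_{FSL}) + (1-a_2) G(R_{FFL})\bigr] - V(a_1) \;\ge\; \bar{U}_1.
\end{aligned}
\tag{1}
\]

\[
\begin{aligned}
& a_1 h(a_2)\bigl[a_2 U(B_{SS}-R_{SSH}) + (1-a_2) U(B_{SF}-R_{SFH})\bigr] \\
&+ (1-a_1) h(a_2)\bigl[a_2 U(B_{FS}-R_{FSH}) + (1-a_2) U(B_{FF}-R_{FFH})\bigr] \\
&+ a_1 \bigl(1-h(a_2)\bigr)\bigl[a_2 U(B_{SS}-R_{SSL}) + (1-a_2) U(B_{SF}-R_{SFL})\bigr] \\
&+ (1-a_1)\bigl(1-h(a_2)\bigr)\bigl[a_2 U(B_{FS}-R_{FSL}) + (1-a_2) U(B_{FF}-R_{FFL})\bigr] - V(a_2) \;\ge\; \bar{U}_2.
\end{aligned}
\tag{2}
\]

\[
\begin{aligned}
& a_2 h(a_2)\bigl[G(R_{SSH}) - G(R_{FSH})\bigr] + (1-a_2) h(a_2)\bigl[G(R_{SFH}) - G(R_{FFH})\bigr] \\
&+ a_2 \bigl(1-h(a_2)\bigr)\bigl[G(R_{SSL}) - G(R_{FSL})\bigr]
 + (1-a_2)\bigl(1-h(a_2)\bigr)\bigl[G(R_{SFL}) - G(R_{FFL})\bigr] = V'(a_1).
\end{aligned}
\tag{3}
\]

\[
\begin{aligned}
& \bigl(h(a_2) + a_2 h'(a_2)\bigr)\bigl[a_1 U(B_{SS}-R_{SSH}) + (1-a_1) U(B_{FS}-R_{FSH})\bigr] \\
&+ \bigl(-h(a_2) + (1-a_2) h'(a_2)\bigr)\bigl[a_1 U(B_{SF}-R_{SFH}) + (1-a_1) U(B_{FF}-R_{FFH})\bigr] \\
&+ \bigl(1 - h(a_2) - a_2 h'(a_2)\bigr)\bigl[a_1 U(B_{SS}-R_{SSL}) + (1-a_1) U(B_{FS}-R_{FSL})\bigr] \\
&+ \bigl(-1 + h(a_2) - (1-a_2) h'(a_2)\bigr)\bigl[a_1 U(B_{SF}-R_{SFL}) + (1-a_1) U(B_{FF}-R_{FFL})\bigr] = V'(a_2).
\end{aligned}
\tag{4}
\]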


Constraints 1 and 2 are the Participation constraints for the first and second manager, 
respectively. Constraints 3 and 4 are the Incentive Compatibility (IC) constraints for the first 
manager and second manager, respectively. 
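Constraint 4 is the first-order condition of manager 2's expected utility with respect to his own action. Because a_2 moves both the success probability and the probability of the high signal, its coefficients come from the product rule. The following Python sketch checks that structure numerically. The utility function, the form of h, and all payoffs are hypothetical values introduced here for illustration; only the coefficient pattern is taken from the constraint.

```python
from itertools import product
import math

H_SLOPE = 0.8  # hypothetical h'(a2)


def h(a2):
    """Hypothetical linear signal technology."""
    return 0.1 + H_SLOPE * a2


def U(s):
    """Hypothetical concave utility for manager 2."""
    return math.sqrt(s)


# Hypothetical pools B(X1, X2) and manager 1 allocations R(X1, X2, y).
B = {("S", "S"): 1600.0, ("S", "F"): 1350.0, ("F", "S"): 1000.0, ("F", "F"): 750.0}
R = {("S", "S", "H"): 1200.0, ("S", "F", "H"): 1230.0,
     ("F", "S", "H"): 610.0, ("F", "F", "H"): 620.0,
     ("S", "S", "L"): 1450.0, ("S", "F", "L"): 1340.0,
     ("F", "S", "L"): 820.0, ("F", "F", "L"): 740.0}


def pay2(x1, x2, y):
    """Manager 2 receives whatever is left of the pool."""
    return B[(x1, x2)] - R[(x1, x2, y)]


def gross_expected_utility_2(a1, a2):
    """Manager 2's expected utility of pay, before subtracting V(a2)."""
    total = 0.0
    for x1, x2, y in product("SF", "SF", "HL"):
        prob = ((a1 if x1 == "S" else 1 - a1)
                * (a2 if x2 == "S" else 1 - a2)
                * (h(a2) if y == "H" else 1 - h(a2)))
        total += prob * U(pay2(x1, x2, y))
    return total


def ic2_left_hand_side(a1, a2):
    """Left-hand side of constraint 4, written with its product-rule coefficients."""
    def avg(x2, y):
        # Average over division 1's outcome, which manager 2 does not control.
        return a1 * U(pay2("S", x2, y)) + (1 - a1) * U(pay2("F", x2, y))
    hv = h(a2)
    return ((hv + a2 * H_SLOPE) * avg("S", "H")
            + (-hv + (1 - a2) * H_SLOPE) * avg("F", "H")
            + (1 - hv - a2 * H_SLOPE) * avg("S", "L")
            + (-1 + hv - (1 - a2) * H_SLOPE) * avg("F", "L"))


a1, a2, eps = 0.65, 0.83, 1e-5
numeric = (gross_expected_utility_2(a1, a2 + eps)
           - gross_expected_utility_2(a1, a2 - eps)) / (2 * eps)
analytic = ic2_left_hand_side(a1, a2)
```

The finite-difference derivative and the closed-form expression agree, so setting the latter equal to V'(a_2) is indeed manager 2's first-order condition in this sketch.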

Two points about the formulation are important. First, the total payment to the managers,
B(X_1, X_2), does not depend on the non-contractible information, y_k. Second, manager 1’s
compensation, R(X_1, X_2, y_k), potentially depends on y_k. We solve for R(X_1, X_2, y_k) as a function of
y_k, ex ante, as if the principal could commit to it before the managers take their actions, rather
than having to choose it sequentially rationally, after the actions are taken. The principal is
indifferent ex post as to how to allocate the pool, so he can allocate the pool in the way he would
have chosen if he could commit to the ex ante optimal method, R(X_1, X_2, y_k). Note that this is not
the same as allowing the principal to ex ante contract on y_k, in which case B(·) would also be a
function of y_k. Also notice that, in contrast to Malcomson’s (1984, 1986) analysis of tournaments,


9 For model 1 and all subsequent models studied, we assume that a non-trivial moral hazard problem exists for each
manager, and hence that the multipliers for all the IC constraints are non-zero. Manager 1's objective function is strictly
concave in a_1, and therefore, the first-order representation of his problem (constraint 3) is appropriate. Further, given
the functional independence of the divisions, the results in Sinclair-Desgagne (1994) indicate that the first-order
approach is valid for manager 2's problem as well.

each manager’s compensation potentially depends on the other manager’s performance, and this
strategic aspect and ensuing zero-sum relation between the managers are incorporated into each
manager’s IC constraint.

Before analyzing the solution to the above optimization problem, note that the principal is
never worse off by using the above payoff structure as opposed to completely ignoring y_k and
compensating the managers independently, because this option is still feasible in the above
program. The more interesting issue is whether the optimal allocations to the managers take the
realizations of y_k into account in a non-trivial way under the bonus pool structure. Before
answering this question, we demonstrate that bonus pool schemes are always valuable if the
additional signal provides perfect information about a_2.


Observation. In Model 1, suppose that there exists a smooth non-contractible perfect signal
y = ξ(a_2), ξ'(·) ≠ 0. Then, there exists a bonus pool arrangement that leads to a strict Pareto
improvement over the optimal contract that does not make use of y (Benchmark Case
1).¹⁰


The proof of this result constructs a bonus pool arrangement that maintains the second-best 
outcome for manager 1, while achieving the first-best outcome for the second manager, despite 
the fact that y is non-contractible. This is done by allocating the entire pool to the first manager 
unless y indicates that manager 2 chose the desired level of a_2, in which case the latter is paid the
first-best wage level. 

Under the perfect information assumption, there is no trade-off in equilibrium between the 
benefits of using y for improved contracting with manager 2, and the distortions this introduces 
into the contract with manager 1. We therefore restrict ourselves henceforth to analysis of the y_k
imperfect information setting, in which such a trade-off will arise. Even in this imperfect 
information setting, we now demonstrate that a strict Pareto improvement is achieved by the use 
of bonus pool schemes. 


Proposition 1. In Model 1, the use of y_k in a bonus pool arrangement results in a strict Pareto
improvement over the optimal contract that does not make use of y_k (Benchmark Case
1).


Thus, the principal can always achieve a strict Pareto improvement by making non-trivial use
of the non-contractible information in a discretionary bonus pool.¹¹ An implication of Proposition
1 is that manager 1, to whom the signal y_k is pure noise, is being compensated based on y_k’s
realization and is made to bear at least some of the risk associated with it. This finding appears
to violate Holmstrom’s (1979, 1982) informativeness result, but the thrusts of the two results are
completely different. Whereas Holmstrom deals with the optimal use of contractible information,
Proposition 1 studies the value of using non-contractible information in a discretionary manner.
As we have discussed above, manager 1's compensation would never be made to depend on the
signal y_k if its realization were contractible.

The Pareto improvement from the use of y_k in a bonus pool arrangement arises because it
improves the allocation of risk between the managers. If y_k were contractible, the principal could
make a Pareto improvement by using y_k to shift some of the risk associated with X_2 from the
risk-averse manager 2 to himself, without changing either manager 2's or manager 1’s action choices.


10 All proofs are in the Appendix. 
11 This is not to say that using a bonus pool is always the optimal mechanism for the principal. In section VI, we discuss
some alternative ways of incorporating y_k into the contracting framework.

Clearly this would be efficient even if the principal were risk-averse (this is Holmstrom’s
informativeness result).

With y_k non-contractible, it can still be used within a bonus pool arrangement to improve
risk-sharing without changing the action choices. But now the risk sharing must be done between
manager 2 and manager 1, rather than between manager 2 and the principal. Without the bonus
pool arrangement, each manager bears the risk associated with his own outcome, where the
outcomes are independently distributed. Regardless of the relative risk-aversions of the two
managers, a Pareto improvement can always be achieved if y_k is used to allocate some of manager
2's risk to manager 1 through the bonus pool arrangement. Of course, this risk-sharing between
managers 1 and 2 both imposes more risk on manager 1 and introduces a competitive aspect to
the relationship between the managers that otherwise would not be present. But manager 1 is
locally risk-neutral for small amounts of risk, and hence the reduction in manager 2's risk
premium more than covers the additional risk premium that must be paid to manager 1, as well
as any distortion to the managers' incentives arising from the zero-sum relationship between them
with respect to the non-contractible information, y_k.

While Proposition 1 demonstrates that a strict Pareto improvement can be achieved by using
the non-contractible signal y_k within a discretionary bonus scheme, it does not indicate the size
of the benefit. Table 1 contains a numerical example for which the use of y_k within a discretionary
bonus scheme results in a 15.66% improvement in the principal's expected payoff relative to its
non-use (Benchmark Case 1). Moreover, the bonus pool scheme produces 85.36% of the benefit
that would have been achieved had the signal y_k been made contractible. This example suggests
that using a non-contractible signal within a discretionary bonus scheme can result in potentially
significant reductions in agency costs.
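The two percentages can be reproduced approximately from the expected profits reported in the panel headings of Table 1. The short Python sketch below does so; the variable names are ours, and because the table entries are rounded to two decimals the second ratio need not match the reported figure to the last digit.

```python
# Expected profits from the panel headings of Table 1.
profit_no_pool = 92.85        # Panel A: signal not used
profit_bonus_pool = 107.39    # Panel B: signal used in a discretionary bonus pool
profit_contractible = 109.88  # Panel C: signal contractible

gain = profit_bonus_pool - profit_no_pool

# Improvement relative to not using the signal, in percent.
improvement_pct = 100.0 * gain / profit_no_pool

# Share of the full contractible benefit captured by the bonus pool, in percent.
share_of_contractible_pct = 100.0 * gain / (profit_contractible - profit_no_pool)
```

The first ratio rounds to the 15.66% quoted above. The second comes out at roughly 85.4% from the rounded inputs, consistent with the reported share if that figure was computed from unrounded profits.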

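The risk-sharing between the managers induced by the bonus pool arrangement is illustrated
by the following first-order condition for R_{SSH}:

\[
\frac{G'(R_{SSH})}{U'(B_{SS} - R_{SSH})}
= \frac{\lambda_2 + \mu_2\left[\dfrac{1}{a_2} + \dfrac{h'(a_2)}{h(a_2)}\right]}{\lambda_1 + \mu_1\,\dfrac{1}{a_1}},
\tag{5}
\]

where λ_i and μ_i denote the multipliers on manager i's participation and IC constraints, respectively.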


Notice that (5) is similar to the first-order condition for a single-manager problem with a
risk-averse principal having a utility function for wealth of G(·). In a single-manager problem, once
the manager has chosen his action, the principal and manager have a zero-sum relationship with
respect to wealth, and the optimal policy is to share risk depending on both the principal’s and
manager’s utility functions. In our problem, once the managers have chosen their actions, the
bonus pool arrangement induces a similar zero-sum relationship between them with respect to
wealth. Therefore, how manager 1 (2) is compensated depends not only on his utility function but
also on manager 2’s (1’s) utility function. Despite this, it is always valuable to use y_k in a
discretionary bonus pool arrangement to share risk between the two managers, regardless of their
relative risk aversions, as long as there is a nontrivial moral hazard problem and y_k is marginally
informative about manager 2’s action choice. Moreover, we next provide results regarding the
nature of the optimal bonus pool amounts and the manner in which the pool is allocated to the
managers.


Proposition 2. In Model 1,
a) the total bonus pool is monotonically increasing in the divisional outcomes, i.e.,
B_{SS} > B_{SF}, B_{SS} > B_{FS}, B_{SF} > B_{FF}, and B_{FS} > B_{FF};
b) for any pair of output realizations, the manager of the first (second) division always
receives a lower (higher) bonus when y_H is observed than when y_L is observed; and
c) for any pair of realizations of the other division's output and the non-contractible
signal, each manager's compensation is increasing in his own division's output.


TABLE 1
Optimal Compensation and Actions With and Without Bonus Pools*


        Manager 1                        Manager 2


Panel A: y_k Non-contractible, No Bonus Pool (Exp. Profit = 92.85)

  R_S        1257.42            R_S                   394.00
  R_F         646.25            R_F                    36.50
  a_1           0.65            a_2                     0.81


Panel B: y_k Non-contractible, With Bonus Pool (Exp. Profit = 107.39)

  R_{SSH}    1217.19            B_{SS} − R_{SSH}      415.12
  R_{SFH}    1234.71            B_{SF} − R_{SFH}      131.87
  R_{FSH}     612.39            B_{FS} − R_{FSH}      405.74
  R_{FFH}     623.99            B_{FF} − R_{FFH}      129.64
  R_{SSL}    1467.27            B_{SS} − R_{SSL}      165.04
  R_{SFL}    1360.04            B_{SF} − R_{SFL}        6.54
  R_{FSL}     830.22            B_{FS} − R_{FSL}      187.92
  R_{FFL}     746.48            B_{FF} − R_{FFL}        7.16
  a_1           0.65            a_2                     0.83


Panel C: y_k Contractible (Exp. Profit = 109.88)

  R_S        1257.42            R_{SH}                417.20
  R_F         646.25            R_{FH}                141.55
                                R_{SL}                141.55
                                R_{FL}                 11.35
  a_1           0.65            a_2                     0.83


* This numerical example was generated under the following assumptions: G(s) = s^{0.35}/0.35;
U(s) = 2s^{0.5}; V(a) = a^2/(1 − a); Ū_i = 31; X_{iS} = 1000, X_{iF} = 0; and h(a_2) = a_2.
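As a plausibility check on the numerical example, the Python sketch below evaluates each manager's expected utility under the Panel A contracts of Table 1. It assumes the functional forms G(s) = s^0.35/0.35, U(s) = 2s^0.5, V(a) = a^2/(1 − a) and a reservation utility of 31 from the note to the table; these forms, and the use of the rounded actions 0.65 and 0.81, should be treated as assumptions of the sketch rather than as established inputs.

```python
def G(s):
    """Assumed utility of manager 1: s**0.35 / 0.35."""
    return s ** 0.35 / 0.35


def U(s):
    """Assumed utility of manager 2: 2 * s**0.5."""
    return 2.0 * s ** 0.5


def V(a):
    """Assumed disutility of effort: a**2 / (1 - a)."""
    return a ** 2 / (1.0 - a)


def net_expected_utility(utility, pay_success, pay_failure, action):
    """Expected utility of pay minus effort cost when success has probability `action`."""
    expected_pay_utility = action * utility(pay_success) + (1.0 - action) * utility(pay_failure)
    return expected_pay_utility - V(action)


# Panel A of Table 1 (no bonus pool), with actions rounded as reported.
net_utility_1 = net_expected_utility(G, 1257.42, 646.25, 0.65)
net_utility_2 = net_expected_utility(U, 394.00, 36.50, 0.81)
```

Both values come out very close to 31, which is what one would expect if each participation constraint binds at the optimum under these assumed forms.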


Thus the bonus pool available for compensation increases in both the manager's own outcome and the
other division's outcome (part a), which is both intuitively reasonable and consistent with observed
practice. The question remains as to how y_k affects the allocation of the bonus pool. Part (b)
establishes that the principal always allocates a greater discretionary bonus to manager 1, at the
expense of manager 2, when the non-contractible signal y_L indicates that manager 2 may not have
chosen the appropriate level of effort, a_2. The opposite occurs when a signal of y_H is realized. This
split is intuitive given the zero-sum nature of the bonus pool allocation, and highlights that
manager 1 is acting as a budget-balancer for manager 2's compensation, providing an additional
means for the principal to induce manager 2 to implement the principal’s desired choice of action.
Parts (a) and (b) indicate that the bonus pool induces competition among the division managers
with respect to y_k, but not with respect to X. Finally, part (c) establishes that there is monotonicity
of compensation in own-output, consistent with what one would expect in an agency setting with
independent divisions.
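The three parts of Proposition 2 can be checked directly against Panel B of Table 1. The Python sketch below does this; the key convention (division 1 outcome, division 2 outcome, signal) is our reading of the table's row labels and is therefore an assumption, although the fact that the implied pools agree across signals supports it.

```python
# Table 1, Panel B payoffs, keyed by (division 1 outcome, division 2 outcome, signal).
manager_1 = {"SSH": 1217.19, "SFH": 1234.71, "FSH": 612.39, "FFH": 623.99,
             "SSL": 1467.27, "SFL": 1360.04, "FSL": 830.22, "FFL": 746.48}
manager_2 = {"SSH": 415.12, "SFH": 131.87, "FSH": 405.74, "FFH": 129.64,
             "SSL": 165.04, "SFL": 6.54, "FSL": 187.92, "FFL": 7.16}
outcomes = ("SS", "SF", "FS", "FF")

# Total pool implied under each signal; the two should agree up to rounding.
pool_high = {o: manager_1[o + "H"] + manager_2[o + "H"] for o in outcomes}
pool_low = {o: manager_1[o + "L"] + manager_2[o + "L"] for o in outcomes}
pool_independent_of_signal = all(abs(pool_high[o] - pool_low[o]) < 0.02 for o in outcomes)

# Part (a): the pool increases in each division's outcome.
part_a = (pool_high["SS"] > pool_high["SF"] and pool_high["SS"] > pool_high["FS"]
          and pool_high["SF"] > pool_high["FF"] and pool_high["FS"] > pool_high["FF"])

# Part (b): manager 1 gets less, and manager 2 more, under the high signal.
part_b = all(manager_1[o + "H"] < manager_1[o + "L"]
             and manager_2[o + "H"] > manager_2[o + "L"] for o in outcomes)

# Part (c): each manager's pay increases in his own division's output.
part_c = all(manager_1["S" + x2 + y] > manager_1["F" + x2 + y]
             and manager_2[x1 + "S" + y] > manager_2[x1 + "F" + y]
             for x1 in "SF" for x2 in "SF" for y in "HL")
```

All four flags evaluate to true for the tabulated values, so the example is at least consistent with the proposition.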

In summary, the results in Proposition 2 seem to indicate that the nature of the managers' 
compensations under the bonus pool arrangement is not very different from those under standard
second-best contracts. The next set of results, however, highlights the extent to which the optimal
payoffs under the bonus pool scheme differ from (i.e., are distorted relative to) the optimal 
second-best payoffs. To prove these results, we will need to make additional assumptions with 
respect to the managers' utility functions. 


Proposition 3. Suppose that manager 2's utility for money exhibits non-decreasing absolute
risk aversion. Then, within the optimal bonus pool arrangement for Model 1, the payoff
to manager 1 is conditioned on the outcome of the second division, X_2.


While we cannot establish a similar result when manager 2's utility function exhibits
decreasing risk aversion, numerical analysis indicates that the result holds in that case as well (as
illustrated by the example in table 1). To understand the demand for this additional distortion, first
recall that without the use of y_k in a bonus pool arrangement, the compensation of manager 1
depends solely on X_1 and the compensation of manager 2 depends solely on X_2. Thus, if there were
no bonus pool arrangement using y_k to compensate manager 1, there would be no demand for
using X_2 to compensate him either. Because y_k and X_2 are unconditionally correlated, compensating
manager 1 on X_2 arises to mitigate some of the risk that is being imposed on him by basing
his compensation on y_k.

Proposition 3 finds conditions under which manager 1's compensation depends on division 
2's outcome. Proposition 4 (below) finds conditions under which manager 2's compensation 
depends on division 1's outcome. 


Proposition 4. Suppose that manager 1’s utility for money exhibits either increasing or 
decreasing absolute risk aversion. Then, within the optimal bonus pool arrangement for 
Model 1, the payoff to manager 2 is conditioned on the outcome of the first division. 


The statements of both Propositions 3 and 4 depend upon conditions with respect to the other
manager's utility function, as a result of the risk-sharing nature of the bonus pool arrangement
explained earlier in the discussion of (5). Given that manager 1’s compensation is already
dependent on y_k, Proposition 4 finds conditions under which a Pareto improvement can be made
by allocating some of the risk associated with X_1 from manager 1 to manager 2.

Despite the use of other divisions’ outcomes to compensate managers in the bonus pool
setting, an important property of the optimal discretionary allocation scheme is that each
manager’s payoff is not monotonic with respect to the other division’s outcome (X_j). To see this,
consider the following equality, resulting from the first-order conditions with respect to the
managers’ compensations:


\[
h(a_2)\bigl[G'(R_{SFH}) - G'(R_{SSH})\bigr] = \bigl[1 - h(a_2)\bigr]\bigl[G'(R_{SSL}) - G'(R_{SFL})\bigr]. \tag{6}
\]


This equality implies that, given that the first division succeeds, if for some y_k realization manager
1 receives a higher (lower) payoff when the second division fails, then exactly the reverse must
hold for the other y_k realization. Similar results hold for the case where the first division fails, and
for the second manager's compensation patterns. These results demonstrate more clearly the 
interdependencies which the optimal bonus scheme creates. Despite the zero-sum nature of its ex 
post allocations, the bonus pool arrangement does not place the managers in strict direct 
competition with each other with respect to outcomes. Unlike duopoly settings or traditional RPE 
schemes, for example, each division manager here is not always better off if the other division 
fails. 
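The sign reversal described above is visible in the numerical example. The Python sketch below uses the Panel B entries of Table 1, keyed by (division 1 outcome, division 2 outcome, signal), which is our reading of the table's row labels and hence an assumption. It computes how each manager's pay changes when the other division fails rather than succeeds.

```python
# Table 1, Panel B payoffs, keyed by (division 1 outcome, division 2 outcome, signal).
manager_1 = {"SSH": 1217.19, "SFH": 1234.71, "FSH": 612.39, "FFH": 623.99,
             "SSL": 1467.27, "SFL": 1360.04, "FSL": 830.22, "FFL": 746.48}
manager_2 = {"SSH": 415.12, "SFH": 131.87, "FSH": 405.74, "FFH": 129.64,
             "SSL": 165.04, "SFL": 6.54, "FSL": 187.92, "FFL": 7.16}

# Change in manager 1's pay when division 2 fails instead of succeeding,
# holding fixed his own outcome x1 and the signal y.
m1_effects = {(x1, y): manager_1[x1 + "F" + y] - manager_1[x1 + "S" + y]
              for x1 in "SF" for y in "HL"}

# Change in manager 2's pay when division 1 fails instead of succeeding,
# holding fixed his own outcome x2 and the signal y.
m2_effects = {(x2, y): manager_2["F" + x2 + y] - manager_2["S" + x2 + y]
              for x2 in "SF" for y in "HL"}

# The effect has opposite signs under the two signal realizations.
sign_reverses_m1 = all(m1_effects[(x1, "H")] * m1_effects[(x1, "L")] < 0 for x1 in "SF")
sign_reverses_m2 = all(m2_effects[(x2, "H")] * m2_effects[(x2, "L")] < 0 for x2 in "SF")
```

In this example manager 1 gains from the other division's failure under y_H and loses under y_L, while the pattern for manager 2 is the mirror image, so neither manager is uniformly better off when the other division fails.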


V. EXTENSIONS 


In this section, we show that our results extend to more general model specifications 
regarding the nature of the non-contractible information. Our base model (Model 1) assumed that 
there was no private information on which the managers chose their actions and that there was 
a single non-contractible signal which represented information only about the action choice of 
manager 2. One obvious generalization is to allow for a second non-contractible signal which 
represents information only about the action choice of manager 1. It can be shown that all the 
Model 1 results extend to this case in a straightforward manner; further, the intuition underlying 
them is the same as for Model 1. 

Another generalization (referred to as Model 2) is to allow the single non-contractible signal
in Model 1 to be informative about both managers' action choices. Here we assume that
Pr{y_H | a_1, a_2, X_1, X_2} = h(a_1, a_2), where h_1(·) ≠ 0, h_2(·) ≠ 0, and 0 < h(·) < 1. This formulation includes
cases in which y_H is good news about both managers, bad news about both managers, or good news
about one manager and bad news about the other. For example, y_k could represent some corporate-
wide measure of performance (such as an aggregate signal of joint profitability or efficiency) or
a measure of the corporate-wide use of some common resource. Proposition 5 establishes the
value of bonus pool arrangements in this setting.


Proposition 5. The use of y_k in a bonus pool arrangement in the Model 2 setting results in
a strict Pareto improvement over the optimal contract that does not make use of y_k.


Thus, the zero-sum relationship with respect to y_k between the managers, inherent in a bonus
pool arrangement, is valuable even when y_k provides information about both managers’ action
choices. Notice that this result holds even when h_1(·) and h_2(·) have the same sign (in particular,
even if h_1(·) = h_2(·)), in which case providing one manager with positive incentives based on y_k
results in providing negative incentives to the other manager. However, it is still valuable to do
so because the principal can mitigate the negative externality to the second manager by also using
the (X_1, X_2) outcomes to determine the allocation of the bonus pool. In support of this, it can be
shown that, with suitable restrictions on the managers’ utility functions (similar to those in
Propositions 3 and 4), each manager's compensation depends on the other division's outcome in
this model setting as well.

In the final generalization (referred to as Model 3) we assume that manager 1’s problem is
exactly as shown in Model 1. However, we now assume that the manager of division 2, after
contracting with the principal but prior to choosing his action, privately observes a binary signal
on the efficiency of his division. The signal indicates that the division is in a state of either high
productivity (y_H) or low productivity (y_L). The ex ante probability of realization y_H is given by
g ∈ (0,1). The manager then chooses either action a_{2H} or a_{2L} ∈ [0,1]. The choice of action a_{2k}, k
= L, H, induces a probability of a_{2k} that the high output X_{2S} is realized. The productivity
realizations affect manager 2’s disutility of choosing any level of effort; manager 2 is able to exert
effort at a lower total and marginal personal cost under y_H than under y_L. We model this by
assuming that the following conditions hold for manager 2:

(i) V(a_2, y_H) < V(a_2, y_L) ∀ a_2; and
(ii) V_a(a_2, y_H) < V_a(a_2, y_L) ∀ a_2,
where the subscripts denote derivatives.

The major difference from Models 1 and 2, therefore, is that we now specify y_k as a
productivity parameter rather than as a signal about the action taken by the second manager.
Further, while manager 2 now observes y_k before choosing his action, the principal obtains
knowledge of y_k only after both managers' actions have been chosen (but before the managers
have been compensated).¹² For example, y_k could be information about the productivity of
division 2's fixed capital or about demand conditions in division 2's industry. As before, we
assume that y_k is not verifiable information, and so cannot be contracted upon.

As in the previous section, it is clear that if y_k were either unobservable or contractible, the
principal would never make use of it in evaluating the first manager. Nor would there be any role 
for compensating either manager based on the other division's output. However, with non- 
contractible information, we find that the principal engages in both forms of distortion of the 
managers' compensation, and is strictly better off doing so. Retaining our earlier notation, the 
optimization program for the principal in this setting is given as follows. 

MODEL 3 
Max { (gà, t (1—g)a,, )[a,(X,. + X. = B.) + (1-a (X, + X. m BI 
So faust ROBO 4 (e(1-8..) + (1-g)(1-2, Na, X, + X- Ba) + (1-2) (X, + X, — BI) 


subject to: 


ga; la, G(RSS + (1-84) G(Q,] + g(172,)[a, G(R A + (1-8) G(Q] 
+ (1-g)8, [a,G(R,,,) + (1-8) G(Q,U D] B 
+ (1-gX1-a, )[a,G(Rg, ) + (l-a PGR m)l- V(a,) 2 Ut. (7) 


ga, [a, U(B,,-R,, + (1-a PUB -Reeg 

g-a 9 [a, U(B,-R,,.) + (1-a)U(B, -R)] 

(1-g)a,, [a,U(B ,-R,, ) + (1-a)U(B,,-R,, )] 

(1-g)(1-a,, [a, UB, Ren) + (1-4)U(B,-R,, )] 

g V(&,, yg) — (1-g) Via, y,) 2 U2. (8) 


+ + +4 


"Jt is obviously necessary in this model that the second manager (although not the first) observe y, before his action 
choice because he must be able to vary his action choice based upon its realization. By contrast, in models 1 and 2, it 
is irrelevant whether the managers observe the realization of y,. 
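$$
\begin{aligned}
& g\,a_{2H}\,[G(R_{SSH}) - G(R_{FSH})] + g\,(1-a_{2H})\,[G(R_{SFH}) - G(R_{FFH})] \\
& + (1-g)\,a_{2L}\,[G(R_{SSL}) - G(R_{FSL})] + (1-g)(1-a_{2L})\,[G(R_{SFL}) - G(R_{FFL})] = V'(a_1). 
\end{aligned} \tag{9}
$$

$$
a_1\,[U(B_{SS}-R_{SSH}) - U(B_{SF}-R_{SFH})] + (1-a_1)\,[U(B_{FS}-R_{FSH}) - U(B_{FF}-R_{FFH})] = V_a(a_{2H}, y_H) \tag{10H}
$$

$$
a_1\,[U(B_{SS}-R_{SSL}) - U(B_{SF}-R_{SFL})] + (1-a_1)\,[U(B_{FS}-R_{FSL}) - U(B_{FF}-R_{FFL})] = V_a(a_{2L}, y_L) \tag{10L}
$$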


Note that we now have two IC constraints for the second manager’s choice of action, one each 
for $a_{2H}$ and $a_{2L}$, respectively. Our first result is the analog of Proposition 1. 


Proposition 6. The use of y, in a bonus pool arrangement in the Model 3 setting results in 
a strict Pareto improvement over the optimal contract that does not make use of y,. 


The optimal discretionary allocation of bonuses again conditions the payments to manager 
1 on the signal y,, which is completely uninformative about his own effort choice. As before, the 
benefit to the principal from better structuring the second manager’s incentives (through the use 
of the informative signal y,) exceeds the cost of the risk premium that must be paid to the first 
manager for the additional risk that is imposed from conditioning his compensation on y, as well 
as any cost of distortions to the managers’ incentives arising from the zero-sum relationship with 
respect to y,. The advantage of incorporating y, in the contract in Model 3 is that it enhances the 
principal’s ability to motivate the separate actions, $a_{2H}$ and $a_{2L}$, more efficiently. 

As in the previous models, we can also show that the optimal discretionary allocations result 
in other distortions in the way division managers are compensated. For example, as in Proposition 
4, the second manager’s compensation is made a function, not just of his own division’s 
performance and y,, but also of the first manager’s performance. Further, the following result 
establishes, without requiring the additional assumptions in Proposition 3, that manager 1’s 
compensation is contingent on division 2’s outcome in the current model. 


Proposition 7: Within the optimal bonus pool arrangement for Model 3, the principal 
conditions the payoff to the first manager on the outcome of the second division. 


Proposition 7 is stronger than Proposition 3 because of the greater productive improvement 
in the present setting (as compared to Model 1), resulting from the principal’s ability to separately 
motivate manager 2’s different action choices. As a result, there is greater distortion of manager 
1’s compensation based on y, (again, as compared to Model 1), which in turn creates a greater need 
for the filtering of his compensation using $X_2$. 


VI. CONCLUSION AND DISCUSSION 


In this paper, we demonstrated that the use of discretionary bonus pools can result in strict 
Pareto improvements under very general conditions and in various model specifications. The 
value of such compensation structures arises by enabling a principal to utilize non-contractible 
information in a sequentially rational manner, providing one rationale for the frequently observed 
use of such bonus arrangements. The use of such a bonus pool arrangement, while resulting in a 
strict Pareto improvement, also results in distortions to the managers' compensations that are at 
odds with traditional notions of informativeness and controllability. However, these distortions 
are consistent with observed group payment schemes where apparently independent divisions are 
rewarded based on RPE or total firm profits. In this case, each division manager’s compensation 
is being distorted by the uninformative performance of other divisions. 

Our results also have implications for the evaluation of managerial accounting information 
systems. Our analysis provides a relevant benchmark for establishing the value of a management 
accounting system which makes information verifiable that is already available but not contract- 
ible. In establishing the value of reporting information by the managerial accounting system, one 
needs to consider the gains obtained relative to using the information in a bonus pool arrangement, 
not necessarily relative to a setting where the information is not used at all. 

Our models assumed that there were only two possible outcomes for each division and only 
two divisions in the firm. However, the logic of our results and proofs extends to more outcomes 
and more divisions. In particular, a principal is never made worse off by including more divisions 
within a bonus pool arrangement. Further, because the distortion per individual division manager 
would be less as more divisions are added, a single large pool may be strictly preferred to multiple 
smaller pools. Consistent with this, Fox (1979) reports that the median number of participants in 
bonus pools for his survey of firms was 139. 

An interesting extension of our analysis is to consider a setting with a finite number of 
independent periods. Suppose that the principal must ultimately pay out the full amount in each 
period’s bonus pool, but does not necessarily have to do so in that period itself.'? In this case, the 
principal’s sequentially rational decision would be to defer payouts until the end of the game, 
which could adversely affect incentives in prior periods. It may therefore be in the principal’s 
interest to create a cost for such inter-period deferrals, such as by committing ex ante to pay 
interest on the deferred portion of the pool. This would ensure that, in some contingencies at least, 
a portion of the pool would be paid out in each period. 

If the non-contractible information were not used in our models, the compensation paid to 
each manager would be independent of that paid to the other manager, thereby avoiding questions 
of implementability. The bonus pool arrangement sets up a zero-sum game between the managers 
where each manager's compensation depends on the other division's performance; hence, issues 
of implementability (including the possibility of sabotage; see Lazear 1989) may arise and 
reduce the value of discretionary bonus pools. We have found sufficient conditions for Model 1 
that guarantee a unique equilibrium in the game between the managers (for example, when the 
disutility functions are quadratic in action), and hence implementability concerns do not arise. 
However, this is a difficult topic to analyze, especially given our assumption of continuous 
actions, and more general analysis of this issue can be addressed by future research. 2 

There may exist alternative mechanisms for incorporating y, into the contracting framework. 
In Model 3, for example, we assumed that the principal contracts with both managers prior to the 
realization of y,. It might be preferable for the principal to contract with manager 2 alone, who 
in turn would use his knowledge of y, to contract with manager 1 (see Melumad et al. 1992; 
Melumad and Suehiro 1994). Of course, such subcontracting would be of no benefit in the first 
two models. 

Finally, our analysis depends upon the principal making the ex ante optimal allocations when 
he is ex post indifferent as to how to distribute the bonus pool. This assumption also underlies the 
rank-order tournament literature (Malcomson 1984, 1986). At least two reasons could be offered 
as to why the principal might, contrary to this assumption, not make the optimal bonus pool 


13 Fox (1979) notes that 43 of the 100 plans in his survey provide for funds accrued in one year to be carried over to the 
next year. 
distributions: (i) a taste for discrimination by the principal, or (ii) collusion between the principal 
and one of the managers. A taste for discrimination would reduce the value of bonus pools in the 
sense that the principal would be willing to forego at least a portion of the monetary benefits of 
having such an arrangement in favor of the non-monetary utility received from discriminating 
between the agents. Further, collusion between the principal and one of the managers might lead 
to the principal's acting in a manner different from what we have assumed. Both forces could 
reduce the principal's ability to utilize non-contractible information through a bonus pool 
arrangement. Analyzing these issues is beyond the scope of this paper, but they provide 
interesting avenues for future research. 


14 For collusion to be sustainable, there must exist a self-enforcing collusive arrangement between the principal and one 
manager. Such arrangements are extremely difficult to identify in any setting, and very few examples exist in the 
economics literature. In general, most analyses of collusion assume that collusive agreements are costlessly self- 
enforcing (see, for example, Suh 1987 and Tirole 1986). To see the problem of finding a self-enforcing collusive 
agreement in our Model 1, consider the following two approaches. In the first case, assume that the y signal favors 
manager 1. For collusion to occur, the principal, before allocating the bonus, would have to go to manager 2 and tell 
him that the y favored manager 1, but that the principal would allocate the bonus as if the signal favored manager 2, 
provided the latter splits the additional compensation with the principal. Manager 2 could agree. Then, the principal 
would pay out the entire bonus pool consistent with the realized (X,, X,) and his agreed-upon collusive arrangement. 
Note that because the payments are verifiable, the principal has to pay out to the managers the total amount of the bonus 
pool consistent with that (X,, X,) realization. In particular, he cannot retain the “bribe” which manager 2 agreed to pay 
him. However, once the managers are paid, manager 2 would have no incentive to return the agreed-upon bribe amount 
to the principal. Anticipating this, the principal would therefore not try to bribe manager 2. In the second approach, the 
principal can approach manager 2 and promise that, in return for an up-front bribe, the principal would allocate the bonus 
pool consistent with the signal y favoring manager 2. At the time the principal made the allocation, he would clearly 
be indifferent between acting obediently and collusively; therefore, the proposed collusive agreement is not unreason- 
able. However, if the principal could make such an agreement with manager 2, he would be even better off making a 
similar agreement with manager 1 also, and collecting both bribes up-front. Because collusive agreements are not 
enforceable by the courts, and are not announced publicly, the principal could claim, without fear of contradiction, that 
he had not previously made a similar agreement with manager 2. Therefore, neither manager would view the agreement 
as credible and, hence, neither would pay the bribe. 


APPENDIX 


Proof of Observation 


Suppose that the principal's desired first-best action from manager 2 is given by $a_2^*$, and let 
$S^*$ satisfy $U(S^*) - V(a_2^*) = \bar{U}_2$. 


Also, denote the first manager's second-best wage contract as s,(X,). 
Consider the following bonus pool scheme: B(X,,X,) = s,(X,)+S*, 


s(X,)+8* ify #&(a,*) 
ae RORA ies ify=E(a,*) ` 


The second manager thus receives S* if he chooses $a_2^*$ and nothing otherwise, implying that he 
will choose $a_2^*$ and collect his first-best wage. In equilibrium, the first manager therefore faces 
compensation scheme s, (X,) and so chooses his appropriate second-best action. It is clear that the 
principal strictly prefers this to ignoring y and implementing second-best behavior on the part of 
both managers, thus completing the proof. 
Proof of Proposition 1 


Before proving Proposition 1, we first provide two lemmas which will be of use subsequently. 
We denote by $\mu_j$ the multiplier on constraint j, j = 1, ..., 4. 


Lemma 1. The multiplier on the second agent’s IC constraint is positive, i.e., $\mu_4 > 0$. 
Proof of Lemma 1 


Suppose not, i.e., $\mu_4 \le 0$. If $\mu_4 = 0$, it is easy to show that manager 2’s payoffs are independent 
of the y, realization, as well as his own divisional outcome, indicating that his optimal action 
choice is $a_2 = 0$. For any interior action choice, it therefore follows that $\mu_4 \ne 0$, in particular 
$\mu_4 < 0$. 

Consider the first-order conditions for $R_{SSH}$ and $R_{SSL}$: 


NEGO UNE NE, NN II 
U'(Bss — Rss) exa a,  h(a,) | din 
LIONE uL. Mr o i Ha) (A2) 
U'(Bss-Rsg) (H48, +H, a, 1-h(aj) 


As $\mu_4 < 0$, $h'(\cdot) > 0$ and $0 < h(\cdot) < 1$, (A1) and (A2) imply that any interior solution is 
characterized by $R_{SSH} > R_{SSL}$. 
Similar analysis of the first-order conditions for the other six payoffs yields the result that: 
For i, j = S, F, $R_{ijH} > R_{ijL}$. (A3) 
Now, the first-order conditions for R,,,, and Rop are given by: 
G’Rgw) — | 8 acci h’(a,) (A4) 
U'(Be -R;g) (4a, +H l-a,  |h(a;) 
C'R) ([ & m -P - Lgh'(a,) (AS) 
U(Bs-Rgg) Ha +H l-a, i-hí(a) 


As $\mu_4 < 0$, (A1), (A2), (A4) and (A5) together imply: 


Ron) Rew). ang Rss). GO Rep) ao 
U'(Bss — Rss) U'(Bg, — Rs) U'(Ba — Rsa ) U'(Bs, — Ra ) 


Also, from (A1), (A2), (A4), (A5), and the first-order conditions for B,, and B,_, we have: 


ha DiG (Rs) -O Ros] = H-a (Rs) - G'(ass)] (A7) 


(A7) implies that it must be the case that either R,,, > Resu or Rg > Ry. 
If the former is true, then the first inequality in (A6) implies: 


Boo Roca < B Rer => Bos Rast Resi: = Bac Ragrt Rey = Bss x: Bor 
If the latter is true, using the second inequality in (A6) similarly implies that B.. < Bap 
We have thus shown that: $\mu_4 < 0 \Rightarrow$ 


B., < B... (A8) 
Also, using the inequalities in (A6), one can derive the result that: 
B,,-R.a < Bor Ren for k=L, H. (A9) 


Analogous to the above analysis, one can use the following inequalities (given $\mu_4 < 0$): 


LOG Gs) Rs) and — um) $F Row) | 
U'(Bs — Resi) U’ (Bsr 7 Romi) U'(Bss 7 Rss) U'(Bg. - Ra) (A10) 


as well as the fact that either Ry > Rg, or Rz, > Rg, to derive: 
B, < Bj, and Beg Rysy < Ber Reg for k=L, H. (All) 


The final step in the proof is simply to recognize that the combination of the results derived in 
(A3), (A8), (A9) and (A11) implies that the left-hand side of constraint (4) is strictly negative, 
which contradicts the fact that $V'(\cdot) \ge 0$. Thus, it must be that $\mu_4 > 0$. 


Lemma 2. The multiplier on the first agent's IC constraint is positive, i.e., $\mu_3 > 0$. 


Proof of Lemma 2 
Similar to that of Lemma 1. 


To prove Proposition 1, note that any set of contracts that ignores the realization of y, can be 
replicated through the bonus pool arrangement by setting $R_{tH} = R_{tL}$ for t = SS, SF, FS, and FF. It 
is thus sufficient to show that for any interior solution, the payoff to the first manager for some 
vector of output realizations varies as a function of y,. 

But from Lemma 1, we know that $\mu_4 > 0$. From (A1), (A2) and the fact that $h'(\cdot) > 0$ and 
$0 < h(\cdot) < 1$, we thus know that any interior solution is characterized by $R_{SSH} < R_{SSL}$, giving us the 
desired result. Therefore, the bonus pool has a strictly positive value attached to it. 


Proof of Proposition 2 


First, analysis of the first-order conditions for the payoffs for manager 1 immediately yields 
the result (given $\mu_4 > 0$) that: 

For i, j = S, F, $R_{ijH} < R_{ijL}$. 
Next, use the fact that the reverse of the inequalities in (A6) and (A10) must hold if $\mu_4 > 0$. Also, 
we again have either Ropy > R,,,, or Ror, > Ry, ; further, either Rz, > Risa OF Rug, > Ry. A proof 
technique identical to that used in Lemma 1 then establishes that B,, > Bor and B,.-R.., > B 
Ry, fork=L, H. Using the same logic on the other sets of payoffs, we can show that B > B... (thus 
proving that the bonus pool is monotonic in each division's outcome) and also that B,.—R.... > Bep- 
Rp for k-L, H, (thus establishing that manager 2’s payoff is always monotonic in his divisional 
outcome given any outcome for the first division). Finally, using Lemma 2, we can order the 
payoffs of manager 1 in an identical fashion to reveal that Rom > Ren and Ras Pg for k=L, H, 
i.e., manager 1's payoffs are also monotone in his division's outcomes. 


Proof of Proposition 3 
Suppose not, i.e., suppose that Rom = R,,, (= Rgp say), and Ris = Ren (= Ry, say), fork = 


bd 
The first-order condition for R,_, is given by: 


GG) _(_ xu EX 
U"(Bgp — Repy) tres Ce l-a,  h(aj) ud. 


Comparing this to (A1) and setting Rosu = Rg, = Rg, yields 


H4 , Hah’) 


LL, + 
UBg-Rg) (^ & Ha) J (A13) 
U'(Bss 7 Rsp) ine H4 , 4B (27) 

? ]-a,  h(a,) 


An identical analysis of the first-order conditions for R,, and Rpg, (= Ra) yields: 


LN re) 
U’(Bep = Rg) E a, l- h(a.) 1 


í Aca a (A14) 
U (Bss — Rg ) ` H4. Ugh") 
^ ]-a, 1-h(a,) 
U(B.-R4) U'(B,-R.) 
It can be shown that RHS[(A13)] < RHS[(A14)], implying that ——— E « — —53— 3, 


UBs Rou) U'(Bs -Rg) 


Now, (A13) > 1 implies that B p < B,,; also. from the proof of Proposition 1, Roy < R4. But this 

implies that U(Bg; -Rsg +0) | U(Bsg ^ Rg t€) ~wherec<OandB,.—R,,,>B,,—R,,, which 
U'(Bss = Rsa) U’(Bss E Rg ) 

cannot hold when U() exhibits nondecreasing absolute risk aversion. 

Proof of Proposition 4 


Suppose not. This implies that for any realization of y,, manager 2’s payoffs for success or 
failure are independent of division 1’s outcome. In particular, we have 


Bss — Rese = Bus — Risa 
and Bos — Rss, = Bus — Rog. 
Together, these equalities imply that Rosu - Resy = Reg — Reg, = d, say. 
Now, the first-order conditions for Risu and R,,, are given by: 


GR [a p efe (A15) 
U(Bg—-Rgg) ((G-2a))-7H, a, ` h(aj) 


vim ees] ; IE (A16) 


U'(Bg Res) (m0-a,)- H a, 1-h(a,) 
Combining these with (A1) and (A2) and using the hypothesized equalities, we have 
GR) Gas) ( a, aa | m 





G'(Re) G'(Ra) Ha + H, 1- a, 
This implies that we have Sho 2 B rs +d) l 
G (Risg) G'(Rag) 


where d > 0 (from the above inequality) and Risu < R,,, (from (A15) and (A16)). 

But this leads to a contradiction when G() exhibits either increasing risk aversion or decreasing 
G'(R +d) 

risk aversion, as EN would then be a monotone function of R. 


We thus have the desired result. 
Proof of Proposition 5 


The optimization program for Model 2 is the same as that for Model 1 except for replacing 
h(a.) by h(a,,a,) and h’(a,) by h,(a,,a,). In addition, Constraint 3 is replaced by the following 
constraint: 

[a,b C) + hO] fa,G(R,,,) (1.789) GGQ$)] 

+ [-a;h, CO) + 1 - hOT Taj GOV + (78) GG )] 

+ [(1-a8)h,O — hO] [a GG) + (01-2) 6(Q ey) 

+ [- (1-a)h,O) — 1 + hO [aG Rig) + -aG R )] = Va) (3°) 


As in Proposition 1, it is sufficient to show that in any interior solution, the payoff to the first 
manager for some vector of output realizations varies as a function of y,. 

Suppose not, i.e., $R_{tH} = R_{tL}$ for t = SS, SF, FS, and FF. 

Denote the multiplier for constraint j, j = 1, ..., 4, as $\phi_j$. 

Then, from the first-order conditions for R,,,, and , we have the following equality: 


G(Rs) — CR) 


R ENG 6» uno Nm OORT 
SSH SL CU U Bas- Re) U (Beg — Res) 


h h 
0585 t 6, 0,8, Pu $3, +94 0,2; —- 





id h 
a, +O; t 0,8, = $a, +0; — 9,8, qe 


a,h a 1 
PUE a ee T (A17) 
(ġa, +93) O,h. 05a, + 4 $, +—4 
ao 
From analysis of the first-order conditions for Roeg and Rey, we have: 
h h 
o,(1—a,)-, 6,0—a,) — $,0—2a,)-9, —6,0—a,) —-— 
7 . 1-h 
Rory = Rep, © = h, 
$ia; +O; * 6,8 h 12; +93 — 938, Ia 
$38,h, = l-a, CE (A18) 





€» ——————— 
(6,2, + 064) >,h, $,(1—a,)— 6, $, — $9, 
Equating (A17) and (A18), we see that both conditions can hold if and only if $\phi_4 = 0$, i.e., the 
second manager's IC constraint is not binding, which cannot be true at the optimum. 
Therefore, we have a contradiction, thus proving the result. 


Proof of Proposition 6 


Denote by $\lambda_j$ the multiplier on constraint j, j = 1, ..., 5. 

We use the same proof technique employed for Propositions 1 and 5. 

Suppose that the bonus payments do not vary in y, i.e., $R_{tH} = R_{tL}$ for t = SS, SF, FS, FF. 
Consider the first-order conditions for the payoffs when the first manager’s division succeeds: 


G’ Roa) | à lf.) | 











U'(Bs-Rgg) (Aa *À4À ? laggy (A19) 
As 
Vi me mm US A17 l- az; (A20) 
ORs) are A te 
U’(Bec Rea) (Aa ? a (A21) 
SS SSH 14 2H 
28s) fg A 
U’(Bog —Reg ) (Aa +À Aor (A22) 
À À A, l-a 
From (A19) and (A20), Ry, = R,,, implies that T24 = del = 
2H 2L 5 2L 
À As. X, a 
Similarly, from (A21) and (A22), Regu = Reg, implies that ; 77,1605 747 
2H 2 


For these two equalities to hold, it must be the case that $a_{2H} = a_{2L}$. 
However, $R_{tH} = R_{tL}$ for t = SS, SF, FS, and FF implies that the LHS of constraints (10H) and (10L) 
coincide, in turn implying that $V_a(a_{2H}, y_H) = V_a(a_{2L}, y_L) \Rightarrow a_{2H} > a_{2L}$ (as $V_a(a, y_H) < V_a(a, y_L)$ and 
$V_{aa}(\cdot) > 0$). We thus have a contradiction. 
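
The last step of this argument can be illustrated with a small numeric sketch. It assumes a specific disutility function, V(a, y) = a^2 / (2y), which is not taken from the paper; it is chosen only because marginal disutility is then increasing in the action and lower in the high-productivity state. All names and numbers below are hypothetical. If the marginal benefit of effort on the left-hand side of the two IC constraints is the same in both states, the action chosen in the high state must be strictly larger.

```python
# Illustrative assumption (not from the paper): V(a, y) = a**2 / (2 * y),
# so marginal disutility is V_a(a, y) = a / y and V_aa = 1 / y > 0.
def marginal_disutility(a: float, y: float) -> float:
    return a / y


def action_given_marginal_benefit(m: float, y: float) -> float:
    """Solve V_a(a, y) = m for a under the assumed functional form."""
    return m * y


# Hypothetical productivity states and a common marginal benefit of effort,
# i.e. the case where the two IC constraints have identical left-hand sides.
y_high, y_low = 2.0, 1.0
marginal_benefit = 0.3

a_2H = action_given_marginal_benefit(marginal_benefit, y_high)
a_2L = action_given_marginal_benefit(marginal_benefit, y_low)
print(a_2H, a_2L)
```

Under these assumptions the two actions cannot be equal, which is the contradiction the proof relies on.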
Proof of Proposition 7 

We first prove the following lemma. 


Lemma 3. If, at optimality, manager 1's payoffs are independent of division 2's outcomes, 
then $a_{2H} \ne a_{2L}$. 


Proof of Lemma 3 


Suppose not, i.e., manager 1’s payoffs do not depend on division 2's outcomes, and we have 
$a_{2H} = a_{2L}$. Now, it must be the case that either À | < À., À , >A, ord, = X... We show that each one 
of these cases leads to a contradiction, thus proving that $a_{2H} \ne a_{2L}$. 

(i)Suppose that À, < À.. a - RHS of (A21) « E" of (A22), implying that 
G’ 
OR) 


= Ray > R 
U'(Bs —Regy)  U"(Bgg z ns PS 
G'(R s) G’(Ren ) 

eit OS O ans WU MBLA — 

cs = > —R CR 
Similarly, RHS (A19) > RHS (A20) U'(Bg, -R ) U'(B,, E Ror) SFH SFL 


But this implies that Rooy = Reorg and R,, = Ren, cannot hold, which is a contradiction. 

(11) Suppose that À, >A , Lhen, a similar argument shows that Regy < Res» and Ropy > Rea» 
which again contradicts Rosy = Rory and Rog = Rg, - 

(iii) Suppose that A, = À.. With À, = À, and & y =a,,, the first-order conditions indicate that 
manager 1’s payoffs are independent of y,. But this implies (from constraints (10H) and (10L)) 
that V (8,,y,) = V,(&, ,y,) => ay > a, which is a contradiction. 


Returning to the proof of the Proposition, suppose not, i.e., suppose that manager 1’s payoffs 
do not depend on the outcome of division 2. In terms of our notation, this implies that $R_{SSk} = R_{SFk}$ 
(= $R_{Sk}$, say), and $R_{FSk} = R_{FFk}$ (= $R_{Fk}$, say), for k = L, H. 
Consider the first-order conditions with respect to the choice of $B_{SS}$ and $B_{SF}$: 
g D. tA] U'(Bs- Ras) + (1-8) Doa + AJU (BR) = ga + (1-g)8, 
g[A,(1-a,,,) — AJU (B o Ras) + (1-g)[A,(1-a,,) - AJU Bo Ror) 
= g(l-a + (1-g)(1-a,) 
Combining these with (A19) — (A22) yields the following equality: 


C= ayy IG" Rom) + = 8) = ag, )G(Rg,) _ Ba;4 Gag) 7 25 G (Gag) 
ga G3 a y) + (1 Hu g) z 8, ) $855 + (1 aum g£)85 (A23) 


An identical analysis of the first-order conditions for $B_{FS}$ and $B_{FF}$, along with the first-order 
conditions for the payoffs when the first manager's division fails, yields the equality: 


g-a pO R n t-g -aa )G(Rg )  ga;4G'(Rgg,) + (1— g)a; G(R eg) 
= (A24) 

g-a) t-ga) gazy + (1- g)à, 

Now, if $R_{SSk} = R_{SFk} = R_{Sk}$, (A23) reduces to: $[G'(R_{SH}) - G'(R_{SL})][a_{2H} - a_{2L}] = 0$. 

Similarly, (A24) simplifies to the following: $[G'(R_{FH}) - G'(R_{FL})][a_{2H} - a_{2L}] = 0$. 

As $a_{2H} \ne a_{2L}$ (from Lemma 3), it must be that $R_{SH} = R_{SL}$ and $R_{FH} = R_{FL}$. But this implies that manager 
1’s payoffs are independent of the y, realization, which contradicts Proposition 1. 

Thus, at optimality, manager 1’s payoffs must depend on division 2’s outcome. 
I. INTRODUCTION 


This study examines the empirical association between trading volume and belief revisions 
that differ among individual analysts. Accountants’ interest in this association stems in 
part from interest in interpreting volume reactions observed in financial accounting 
studies (e.g., Beaver 1968, Morse 1981, Bamber 1986). It has been suggested that the trading 
volume around earnings announcements arises because belief revisions differ across investors. 
Beaver (1968, 69), for example, offers the following intuition: “An important distinction between 
price and volume tests is that the former reflects changes in the expectations of the market as a 
whole while the latter reflects changes in the expectations of individual investors.” More 
formally, Karpoff (1986) identifies two distinct theoretical links through which changes in the 
expectations of individual investors can stimulate trading volume: 


[The] links of volume to information [provide] a rationale for the use of volume in 
event studies. ... Unusually high volume can result from heterogeneous reactions to the 
information, but it does not necessarily reflect disagreement among traders; it can also 
reflect consensus among traders with diverse priors (Karpoff 1986, 1084). 


Several accounting studies investigate the link between trading volume and diversity in 
investors’ prior beliefs by using the level of dispersion in analysts’ earnings forecasts (i.e., the 
cross-sectional variation in forecasts) as a measure of diversity in prior beliefs (Comiskey et al. 
1987; Ajinkya et al. 1991; Stickel 1991; Atiase and Bamber 1994; Kross et al. 1994; Bamber and 
Cheon 1995). Most of these studies document a positive association between forecast dispersion 
and trading volume.1 For example, Ajinkya et al. (1991) examine the trading volume associated 
with an almost continuous flow of information to financial markets and find a positive association 
between monthly forecast dispersion and firms’ percentage of outstanding shares traded.2 In 
contrast, there is little direct empirical evidence concerning Karpoff's prediction that trading 
volume is caused by both diversity in prior beliefs and differential belief revisions. Similar links 
between trading volume and differential belief revisions are also suggested by most of the 
theoretical trading volume research in the accounting literature (e.g., Jang and Ro 1989; 
Holthausen and Verrecchia 1990; Kim and Verrecchia 1991; Dontoh and Ronen 1993). 

Some prior research can be interpreted as studying the association between trading volume 
and differential belief revisions. Ziebart (1990) and Lang et al. (1992) examine changes in 
analysts' forecast dispersion around earnings announcements and find significant positive 
associations between dispersion changes and measures of trading volume. Yet, these studies do 
not provide evidence that dispersion changes reflect a different influence on trading volume than 
prior levels of dispersion, since they do not include prior dispersion levels in regression models.3 
Further, changes in forecast dispersion may not reflect the differential belief revisions referred 
to by Karpoff (1986). For example, Lang et al. (1992) use squared changes in forecast dispersion. 


1 Stickel (1991) finds no significant relation between forecast dispersion and trading volume after controlling for price 
changes (although Atiase and Bamber (1994) and Kross et al. (1994) do find a positive relation between forecast 
dispersion and trading volume after controlling for price changes). 

2 Ajinkya et al. (1991) use contemporaneous forecast dispersion in their study. Karpoff's (1986) model only suggests an 
association between contemporaneous dispersion and trading volume to the extent it reflects differences in investors' 
prior beliefs or differential contemporaneous belief revisions. 

3 Ziebart (1990), for example, uses an event study methodology in which change in firm-specific trading is the dependent 
variable. The level of differential prior expectations is not included in his regression models because Ziebart does not 
expect it to impact the change in firm-specific trading across a two-week event period. 
This measure assigns a decrease in dispersion the same value as an increase in dispersion of the 
same magnitude. Thus, it may reflect either disagreement or consensus among analysts with 
diverse prior beliefs. Ziebart (1990) avoids this ambiguity by using changes in dispersion per se. 
Yet, Ziebart’s measure may not reflect changes in the relative position of individual expectations 
within the distribution of all expectations (e.g., two forecasts that swap positions). According to 
Karpoff (1986), it is this type of differential belief revision, or “jumbling” of expectations, that 
stimulates trading volume. 
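
To see why a dispersion-based measure can miss this kind of jumbling, consider a minimal numeric sketch. The forecasts below are hypothetical (they are not data from this study) and the variable names are illustrative assumptions. Two analysts swap positions, so cross-sectional dispersion is unchanged even though individual revisions differ sharply.

```python
import statistics

# Hypothetical earnings-per-share forecasts from three analysts (illustrative only).
prior = [1.00, 1.20, 1.40]
# The first and third analysts swap positions; the second does not revise.
current = [1.40, 1.20, 1.00]

prior_dispersion = statistics.pstdev(prior)
current_dispersion = statistics.pstdev(current)

# Change in dispersion per se, and its square (which treats increases and
# decreases of the same magnitude identically).
dispersion_change = current_dispersion - prior_dispersion
squared_dispersion_change = dispersion_change ** 2

# Analyst-level revisions nevertheless differ in sign and size.
revisions = [c - p for c, p in zip(current, prior)]
print(dispersion_change, squared_dispersion_change, revisions)
```

In this example both dispersion-change measures register essentially no movement, while the analyst-level revisions are far from uniform. This is the case that a measure based on relative positions is meant to pick up.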

The changes in forecast dispersion used in prior studies are also likely to be biased. Recent 
empirical analysis of detailed forecast data suggests a spurious association between financial 
news and increases in forecast dispersion, which simply results from outdated forecasts that do 
not move (Brown and Han 1992; Stickel 1995). Further, Abarbanell et al. (1995) argue 
analytically that inferences from studies using changes in forecast dispersion are threatened by 
the failure to control for the magnitude of price changes. The magnitude of price changes controls 
for the information (or informedness) contained in the average belief revision, information that 
is likely to be correlated with both trading volume and changes in forecast dispersion. 

This study tests the predictions that (a) differential belief revisions stimulate trading volume 
and (b) differential belief revisions exert a different influence on trading volume than diversity 
in prior beliefs. As in Ajinkya et al. (1991), I examine monthly trading volume that is not restricted 
to months containing formal accounting events/disclosures. Yet, this study differs from Ajinkya 
et al. (1991) in three important respects. First, I use the correlation between the relative positions 
of individual analysts’ current and prior month’s earnings forecasts as an empirical proxy for the 
degree of differential belief revision. The aim of this correlation measure is to capture Karpoff's 
notion of “jumbled” expectations. Second, multiple disagreement measures are included in a 
multiple regression to provide evidence on whether jumbled forecasts reflect different influences 
on trading volume than the dispersion in prior forecasts. I also use the change in dispersion 
measure introduced by Ziebart (1990) as a supplemental measure of disagreement, since forecasts 
may become increasingly dispersed without changing their relative positions (i.e., without 
jumbling). Third, I address a concern that the association between trading volume and disagree- 
ment proxies may be due to either stale forecasts or price effects by (a) constructing disagreement 
measures using detailed I/B/E/S (Institutional Brokers' Estimate System) data that are purged of 
potentially stale forecasts, and (b) controlling for the magnitude of price changes. 
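
The three disagreement measures described above can be sketched in a few lines. This is only one possible reading: the choice of the Pearson correlation, the sample standard deviation as the dispersion measure, the function names, and the forecast values are all assumptions made for illustration, not the estimators or data used in this study.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of forecasts."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def disagreement_measures(prior, current):
    """Return (prior dispersion, change in dispersion, forecast correlation).

    prior[i] and current[i] must be forecasts by the same analyst.
    """
    prior_dispersion = statistics.stdev(prior)
    dispersion_change = statistics.stdev(current) - prior_dispersion
    return prior_dispersion, dispersion_change, pearson(prior, current)


# Hypothetical forecasts for four analysts following one firm.
prior = [1.00, 1.10, 1.25, 1.40]
parallel = [f + 0.05 for f in prior]  # every analyst revises by the same amount
jumbled = [1.40, 1.00, 1.25, 1.10]    # same set of forecasts, reordered across analysts

_, change_parallel, corr_parallel = disagreement_measures(prior, parallel)
_, change_jumbled, corr_jumbled = disagreement_measures(prior, jumbled)
print(corr_parallel, corr_jumbled)
```

In both hypothetical months the change in dispersion is essentially zero, yet the correlation is close to one when all analysts revise alike and much lower when positions are reshuffled. That difference is why the correlation is included alongside the two dispersion measures.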

Empirical results are consistent with Karpoff's prediction that trading volume is associated 
with both differential prior beliefs and differential belief revisions. In multiple regression 
analysis, estimated coefficients show that the monthly percentage of equity shares traded is 
associated with (1) high levels of dispersion in the prior month's forecasts, (2) increasingly 
dispersed forecasts and (3) low correlations between analysts’ current and prior forecasts (i.e., 
heterogeneous forecast revisions). Additional analysis suggests an association between trading 
volume and all three disagreement measures, even after controlling for price changes, market- 
wide trading and firm size. These findings add to prior empirical evidence suggesting that trading 
volume is associated with dispersion measures, while also demonstrating an additional link 
between volume and the correlation between current and prior forecasts (a proxy for jumbled 
beliefs). 

Accountants' interest in differential belief revisions extends beyond understanding trading 
volume. For example, the correlation between current and prior forecasts is potentially useful for 
interpreting changes in forecast dispersion. Forecast dispersion reflects analysts' average 
“informedness” (i.e., uncertainty) as well as information asymmetry across analysts (Barron 
1993; Abarbanell et al. 1995). Thus, decreases in forecast dispersion may arise from either 
publicly or privately observed news, because both types of news increase average informedness. 
Yet, if the news does not jumble forecasts, then this suggests that it is commonly observed and 
interpreted. In other words, similar belief revisions are likely to result from news that is similarly 
observed and interpreted.⁴ Thus, the correlation between current and prior forecasts may be useful 
in future studies that examine specific types of events (e.g., earnings announcements) to assess 
whether they are commonly interpreted. Accountants' interest in this type of examination extends 
to the broader issue of producing financial reports that communicate clearly. 

The next section discusses testable hypotheses suggested by Karpoff (1986). Section III 
discusses sample selection and research design issues. Section IV presents and analyzes results. 
Section V concludes with summary comments. 


II. DEVELOPMENT OF HYPOTHESES 


I base hypotheses on the assumption that disagreement among analysts reflects disagreement 
among investors. More specifically, I assume that different positions and changes in individual 
analysts' earnings forecasts are reasonable proxies for different positions and changes in 
individual investors’ bid and ask prices. This assumption is consistent with empirical evidence 
suggesting that investors use analysts' earnings forecasts as important inputs when evaluating 
firms (Givoly and Lakonishok 1984) and that earnings forecasts are among the most important 
determinants of analysts' buy/sell recommendations (Previts et al. 1994). 

Before considering the association between trading volume and disagreement, it is important 
to recognize that trading volume reflects many types of changes in investors' portfolios. 
According to Karpoff's model, portfolio changes result from a reordering or “jumbling” of 
investors' bid and ask prices (or simply demand prices). This jumbling can result from non- 
informational factors such as changes in investors' consumption preferences. Karpoff's model 
suggests that some of this non-informational, or liquidity, trading will occur at all times because 
investors' demand prices constantly change due to idiosyncratic consumption needs and portfolio 
rebalancing. 

This paper focuses on the trading volume that results from information arrival. Karpoff's 
model suggests that information induces trading volume in two distinct ways. First, trading results 
when contemporaneous news causes investors to react heterogeneously. In other words, one 
cause of jumbled demand prices is the differential changes in investors' expectations that occur 
when investors have different interpretations of public information. Second, trading is predicted 
when investors with different prior period beliefs observe commonly interpreted news. In 
essence, commonly interpreted news serves to resolve disagreement between investors who hold 
long and short positions in a stock. Although the resolution of prior disagreement does not jumble 
investors' expectations, it does create an incentive for investors to discontinue holding purely 
speculative positions. Thus, Karpoff's model suggests that commonly interpreted news is 
associated with the trading volume that results as investors close out their speculative positions 
(i.e., consume or invest in other securities). 

4 Although the interpretation of similar belief revisions is relatively clear, the interpretation of differential belief revisions 
is more ambiguous. Differential belief revisions may reflect the influences of differential (i.e., private) information, 
differential interpretations, or common interpretations among investors/analysts possessing private information of 
varying precision (i.e., varying reliability). Each of these potential causes of differential belief revisions has been linked 
theoretically with trading volume. 


Barron—Trading Volume and Belief Revisions That Differ Among Individual Analysts 585 


Figure 1 illustrates three different empirical proxies for investor disagreement. The first is the 
Pearson correlation between individual analysts' forecasts made in the prior period and corresponding 
forecasts made in the current period.⁵ This is the disagreement measure introduced in 
this study. For example, the differential forecast revisions occurring from t=0 to t=1 in figure 1 
can be measured using the correlation between forecasts at t=0 and t=1. Although Karpoff's 
model suggests that differential belief revisions stimulate trading volume, his model does not 
necessarily imply that increases in dispersion cause more trading volume than decreases in 
dispersion.⁶ The correlation between current and prior forecasts is useful because it is not 
dependent on changes in the overall distribution of forecasts and more directly proxies for 
Karpoff's concept of jumbled expectations.⁷ 
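
As a minimal sketch of this correlation measure, the Python below computes the Pearson correlation between paired prior and current forecasts. The forecast values and variable names are hypothetical and chosen only for illustration; they are not data from the study. The two cases have the same spread of forecasts in both periods, so the difference in correlation reflects only the reordering of relative positions.

```python
import math

def pearson(prior, current):
    """Pearson correlation between paired prior and current forecasts."""
    n = len(prior)
    if n != len(current) or n < 2:
        raise ValueError("need at least two paired forecasts")
    mean_p = sum(prior) / n
    mean_c = sum(current) / n
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(prior, current))
    var_p = sum((p - mean_p) ** 2 for p in prior)
    var_c = sum((c - mean_c) ** 2 for c in current)
    if var_p == 0 or var_c == 0:
        raise ValueError("correlation is undefined when forecasts have zero variance")
    return cov / math.sqrt(var_p * var_c)

# Hypothetical earnings forecasts from four analysts (illustrative numbers only).
prior = [1.00, 1.10, 1.20, 1.30]
parallel_shift = [1.05, 1.15, 1.25, 1.35]  # every analyst revises by the same amount
jumbled = [1.30, 1.00, 1.20, 1.10]         # same set of values, reordered across analysts

rho_shift = pearson(prior, parallel_shift)   # no jumbling
rho_jumbled = pearson(prior, jumbled)        # relative positions change
print(rho_shift, rho_jumbled)
```

In this illustration the parallel shift leaves the correlation at one, while the reordering drives it negative even though the overall distribution of forecasts is unchanged.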

The second measure of disagreement, forecast dispersion, is measured at a point in time. For 
example, in figure 1 forecast dispersion is equal in periods t=0, 1 and 3. Karpoff's model suggests 
that diversity in investors' prior beliefs can stimulate trading volume around contemporaneous 
news. Dispersion in analysts’ earnings forecasts is used by Atiase and Bamber (1994), who 
confirm the prediction that volume reactions around earnings announcements are positively 
associated with dispersion in forecasts prior to announcements. 

5 The Spearman correlation of analysts’ forecasts was also measured as: 
$rho = 1 - \{(6 \cdot \sum d^2) / (nf \cdot (nf^2 - 1))\}$; 
where d is the difference between the i-th qualifying analyst's current and prior month's forecast rank, and nf is the number 
of qualifying forecasts. The Spearman measure places more emphasis on the reordering of forecasts. I do not report regression 
results using this measure because (1) it does not reflect as many conceptual forms of differential belief revisions as the Pearson 
measure, (2) it is highly correlated with the Pearson measure and (3) results using the Spearman measure are virtually identical 
to those using the Pearson measure. 
6 Dontoh and Ronen (1993) provide another example of a model in which trading volume is not necessarily affected by 
whether changes in dispersion are increases or decreases. 
to forecast dispersion measures, the correlation measure introduced in this paper may be a more direct 
measure of Holthausen and Verrecchia’s (1990) consensus construct (see note 24). 


The third measure of disagreement is the change in forecast dispersion from one period to the 
next. In figure 1, the change in dispersion from t=1 to t=2 suggests agreement, whereas the change 
from t=2 to t=3 suggests disagreement. This measure of disagreement is the focus of Ziebart 
(1990). Although Karpoff's model suggests that trading volume can be associated with either 
divergence or convergence in beliefs, empirical results reported by Ziebart (1990) suggest that 
trading volume around earnings announcements is associated with divergence rather than 
convergence in forecasts. Ziebart argues that changes in forecast dispersion capture contemporaneous 
disagreement that causes trading volume.⁸ If so, then changes in forecast dispersion also 
may explain trading volume that is incremental to that associated with the correlation measure 
introduced in this study. Notice that the relative positions of current and prior forecasts are highly 
correlated across periods two and three in figure 1, although there is a form of heterogeneity in 
these forecast revisions (i.e., divergence). Provided that analysts' forecasts of earnings are 
reasonable proxies for investors' beliefs about underlying security values, this analysis suggests 
the following interrelated research hypotheses: 


H1: The correlation between analysts’ current and prior period forecasts is inversely related 
to trading volume after controlling for other volume-related effects. 


H2: The level of prior dispersion in analysts' forecasts is positively related to trading volume 
after controlling for other volume-related effects. 


H3: Change in analysts' forecast dispersion is positively related to trading volume after 
controlling for other volume-related effects. 
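
The dispersion measures behind H2 and H3 can be sketched as follows. This is an illustration under stated assumptions, not the study's own computation: dispersion is taken as the coefficient of variation (standard deviation scaled by the absolute mean), the sample standard deviation is an assumption since the text does not say which estimator is used, and all forecast values are hypothetical.

```python
import statistics

def dispersion(forecasts):
    """Coefficient of variation: standard deviation scaled by |mean|.

    Uses the sample standard deviation, which is an assumption here.
    """
    mean = statistics.fmean(forecasts)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(forecasts) / abs(mean)

def change_in_dispersion(prior, current):
    """Current-period dispersion minus prior-period dispersion."""
    return dispersion(current) - dispersion(prior)

# Hypothetical forecasts from four analysts (illustrative numbers only).
prior = [1.00, 1.10, 1.20, 1.30]
converged = [1.10, 1.14, 1.16, 1.20]  # beliefs move closer together
diverged = [0.90, 1.05, 1.25, 1.40]   # beliefs move further apart

prior_disp = dispersion(prior)
delta_conv = change_in_dispersion(prior, converged)
delta_div = change_in_dispersion(prior, diverged)
print(prior_disp, delta_conv, delta_div)
```

A negative change indicates convergence and a positive change indicates divergence; both examples preserve the rank order of analysts, so neither would register as jumbling in the correlation measure.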


In addition to disagreement variables, I use three other variables in alternative tests to control 
for factors beyond the scope of Karpoff's model. The first control variable is the absolute 
magnitude of stock returns. Many empirical studies document a positive association between 
trading volume and the absolute magnitude of price changes (see Karpoff 1987 for a review). 
Further, Abarbanell et al. (1995) show analytically that the magnitude of price changes represents 
an important control in tests of the effects of disagreement.⁹ Their model suggests that the average 
belief revision causes trading beyond that caused by disagreement. In essence, price changes 
control for the information contained in the average belief revision made by investors. The second 
control variable, market-wide trading volume (NYSE), is used to control the trading effects of 
events such as shifts in consumption preferences, changes in interest rates and speculation in other 
securities. Market-wide volume controls for liquidity trading that results from such events, as well 
as the corresponding speculation (i.e., disagreement) about economy-wide factors. The third 
control variable is firm size, measured as the total market value of equity (Ziebart 1990 also uses 
this variable). The influence of firm size on the association between disagreement and trading 
volume is examined because of the widely held belief that more news is commonly observed 
about large firms than small firms (e.g., Demski and Feltham 1994). If so, then measures of 
disagreement and trading volume are likely to be smaller for large firms than for small firms. 


III. RESEARCH METHOD 


Disagreement proxies are based on sell-side analysts’ forecasts of one-year-ahead earnings.¹⁰ 
I/B/E/S forecast data are used. To measure differential forecast revisions, I searched for monthly 
observations in which six or more forecast revisions are recorded for each firm. Morse et al. (1991) 
suggest that for a given month, the probability of a revised forecast of the next year's earnings is 
about 0.2 for the average analyst. Thus, I limited the search for monthly observations to firms 
followed by at least 30 analysts during the period January 1984 to December 1990. Detailed 
I/B/E/S tapes contained 203 firms with a following of 30 or more analysts during this period. Ten 
months were missing from the I/B/E/S tapes.¹¹ In addition, I excluded October 1987 (the month of 
the market crash) because of systematically unusual trading volume. The “full” sample then consists 
of 6,727 firm-months, representing 73 months and 172 firms. This full sample is constructed from 
216,155 forecasts (64,558 forecast revisions). 


8 Ziebart (1990) is based on forecast summary data. This limits the types of disagreement measures that can be constructed. 

9 Abarbanell et al. also suggest that the number of analysts following a firm is a potentially important control variable 
for empirical studies that use analysts' forecasts to proxy for investors' expectations. When the number of analysts is 
included in tests performed in this study, I find a statistically significant negative association between the number of 
analysts and trading volume. I do not report this result for two reasons: First, it does not affect qualitative conclusions 
concerning the measures of disagreement examined in this study. Second, the number of analysts is correlated with firm 
size and statistically insignificant when included with firm size in regression models. 

10 Sell-side analysts are the primary producers of earnings forecasts. Sell-side analysts serve individual and institutional 
investors, whereas buy-side analysts tend to be employed by institutional investors or money management firms. Buy- 
side forecasts are not used in this study for two reasons. First, I/B/E/S does not provide individual identification numbers 
for analysts making buy-side forecasts. Second, buy-side and sell-side analysts are likely to face dissimilar forecasting 
incentives. Thus, disagreement measures containing a disproportionate number of buy-side forecasts may be 
systematically biased. 

Unfortunately, the potential influence of nonsynchronous forecast updating and heteroge- 
neous update frequencies suggests alternative explanations for observing associations between 
trading volume and disagreement measures in the full sample. Differences in analysts' updating 
practices likely result in some forecasts being more outdated or “stale” than others. Figure 2 
depicts how news releases can result in uncorrelated forecast revisions and changes in dispersion. 
News releases may result in forecast updating by a subset of analysts. This can cause the relative 
positions of recorded forecasts to change simply because outdated forecasts do not move. Thus, 
stale forecasts cause measurement error in disagreement proxies. 
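
The stale-forecast problem described above can be illustrated with a small sketch. Everything here is hypothetical: the forecast values, the size of the common revision, and which analysts update are assumptions chosen for illustration. All four analysts' true beliefs shift by the same amount, but only two of them record an updated forecast.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical prior-month forecasts from four analysts.
prior = [1.00, 1.10, 1.20, 1.30]

# Commonly interpreted news: every analyst's true belief rises by the same amount.
NEWS = 0.10
true_current = [f + NEWS for f in prior]

# Only analysts 0 and 2 record an update; the other two recorded forecasts are stale.
updated = [True, False, True, False]
recorded_current = [f + NEWS if u else f for f, u in zip(prior, updated)]

rho_true = pearson(prior, true_current)          # beliefs were not jumbled
rho_recorded = pearson(prior, recorded_current)  # recorded data suggest jumbling
print(rho_true, rho_recorded)
```

The recorded correlation falls below one even though beliefs moved in parallel, which is the kind of measurement error that stale forecasts can introduce into the correlation proxy.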

To mitigate the influence of stale forecasts, I focused on a "reduced" sample that uses only 
forecasts that are revisions of forecasts made one month prior (i.e., revisions of recent revisions). 
This ensured that forecast revisions are associated with beliefs actually revised in the current 
month. The reduced sample contains only those firm-months that had (1) at least four forecasts 
that met this selection criterion, and (2) monthly NYSE price and trading volume data available 
from CRSP (Center for Research in Security Prices). This sample consists of 8,120 prior period 
forecasts (with 8,120 paired revisions) and 1,520 sample observations (i.e., firm-months) 
representing 166 firms (appendix A contains more detail concerning the sample). 
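
A rough sketch of the first selection criterion follows. The record layout, the function name `reduced_sample`, and the integer month index are assumptions made for illustration; the actual I/B/E/S file layout is not described in the text, and the CRSP data-availability criterion is not modeled here.

```python
from collections import defaultdict

MIN_PAIRED = 4  # the text requires at least four qualifying forecasts per firm-month

def reduced_sample(revisions):
    """Return {(firm, month): [(prior, current), ...]} for qualifying firm-months.

    `revisions` holds hypothetical (firm, analyst, month, forecast) tuples, one per
    recorded revision; `month` is an integer index so month - 1 is the prior month.
    A forecast qualifies only if the same analyst also revised in the prior month.
    """
    by_key = {(firm, analyst, month): value
              for firm, analyst, month, value in revisions}
    pairs = defaultdict(list)
    for (firm, analyst, month), current in by_key.items():
        prior = by_key.get((firm, analyst, month - 1))
        if prior is not None:
            pairs[(firm, month)].append((prior, current))
    return {key: val for key, val in pairs.items() if len(val) >= MIN_PAIRED}

# Hypothetical data: firm "A" has four analysts revising in months 1 and 2,
# firm "B" has only three, so only firm-month ("A", 2) qualifies.
revisions = (
    [("A", a, m, 1.0 + 0.01 * a + 0.02 * m) for a in range(4) for m in (1, 2)]
    + [("B", a, m, 2.0 + 0.01 * a) for a in range(3) for m in (1, 2)]
)
sample = reduced_sample(revisions)
print(sorted(sample))
```

Month 1 drops out for both firms because no month-0 revision exists to pair with, which mirrors how the requirement of a recent prior revision shrinks the usable sample.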

To assess whether different measures of analysts' disagreement have incremental explana- 
tory power, I used the reduced sample to estimate the following regression: 


$$\ln(\%Vol_{jt}) = a_0 + a_1 \ln(Disp_{jt-1}) + a_2 \Delta Disp_{jt} + a_3 \ln(1-\rho_{jt}) + u_{jt}$$ 


where: 
%Vol_jt = Percentage of outstanding shares traded for firm j in month t.¹² 
Disp_jt-1 = Coefficient of variation (i.e., the standard deviation divided by the absolute 
mean) of analysts’ forecasts for firm j in month t-1.¹³ 
ΔDisp_jt = Change in the coefficient of variation of analysts’ forecasts for firm j in month t.¹⁴ 
ρ_jt = The Pearson correlation between individual analysts' forecasts made for firm j 
in month t-1 and the corresponding forecasts made in month t. 


11 The missing months in my sample are January 1984, November 1984, July 1985, August 1985, November 1985, April 
1986, May 1986, September 1986, March 1987 and December 1990. I/B/E/S informed me (after this article was 
accepted) that they have been concerned about these missing months and have now recovered some. 

12 I employed %Vol_jt for two reasons: First, it is consistent with prior accounting studies that focus on the general effects 
of financial information (e.g., Comiskey et al. 1987; Ajinkya et al. 1991). Second, %Vol_jt controls for the number of 
outstanding shares. In Karpoff's model, volume increases proportionally with the number of outstanding shares. 
However, %Vol_jt tends to vary across firms, so it is possible that the association between firms' average trading volume 
and analysts' disagreement is caused by some omitted firm-specific variable other than disagreement. One method of 
addressing this issue might be to use only disagreement and volume changes, or to use mean-adjusted volume and 
disagreement measures. Unfortunately, there are very few contiguous observations of disagreement in the reduced 
sample and only a few (sometimes only one) observations for many firms. Moreover, when disagreement itself varied 
across firms, the effect of interest would be partially eliminated. 

13 Scaling the standard deviation of forecasts by the absolute mean forecast remedies heteroscedasticity and maintains 
consistency with prior empirical studies. This scaling raises the possibility of results being influenced by small values 
of the denominator. Regression results are qualitatively similar using the unscaled measure, however. Further, 
regressions yield qualitatively similar estimates using all variables without scaling or log normalizing. It may be, 
however, that other methods of measuring or scaling dispersion are more appropriate. For example, measures scaled 
by stock price are not as subject to small values in the denominator (see note 17). 

I also estimated alternative models that control for variables beyond the scope of Karpoff's 
model. These control variables are: 
|r_jt| = the absolute value of firm j’s stock return during month t. 
Size_jt = the total market value of firm j’s stock at the end of month t. 
%Vol_mt = monthly percentage of outstanding NYSE shares traded. 
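
To make the structure of the base regression concrete, the sketch below fits a model of the same form by ordinary least squares on synthetic, noise-free data. It is only an illustration: the data are randomly generated, the "true" coefficients are hypothetical round numbers rather than the study's estimates, the correlation term uses the ln(1.1 - ρ) form described in the notes, and plain OLS via the normal equations stands in for the estimation (it does not reproduce the Froot standard-error adjustment).

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical coefficients (intercept, prior dispersion, change in dispersion, jumbling).
TRUE = [-2.5, 0.09, 0.05, 0.05]

rng = random.Random(0)
X, y = [], []
for _ in range(200):
    disp_prior = math.exp(rng.gauss(-2.5, 1.0))  # synthetic coefficient of variation
    d_disp = rng.gauss(0.0, 0.5)                 # synthetic change in dispersion
    rho = rng.uniform(-0.99, 1.0)                # synthetic forecast correlation
    row = [1.0, math.log(disp_prior), d_disp, math.log(1.1 - rho)]
    X.append(row)
    # Noise-free log volume, so OLS should recover TRUE up to rounding error.
    y.append(sum(b * v for b, v in zip(TRUE, row)))

estimates = ols(X, y)
print([round(b, 6) for b in estimates])
```

Adding a control such as the absolute return only requires appending another column to each row of `X`.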


14 I chose to use ΔDisp_jt for comparison to Ziebart (1990). Yet, all coefficient estimates are qualitatively similar when the 
current period level of dispersion (i.e., ln(Disp_jt)) is used as a substitute for ΔDisp_jt. This is not surprising since 
Disp_jt-1 is already in the model and Disp_jt = Disp_jt-1 + ΔDisp_jt. 


FIGURE 2 
Influence of “Stale” Forecasts on Disagreement Measures 
Panel B: Same news as in Panel A, but with “stale” forecasts. 

Several of the above variables are transformed for regression analysis because of skewness.¹⁵ 
The correlation measure is also inverted to reflect the jumbling of forecasts, (1-ρ_jt), rather than its 
absence (ρ_jt).¹⁶ 
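
The inversion and log transformation of the correlation measure can be sketched as below. The function name `log_jumbling` is hypothetical; the 1.1 offset follows the article's note that the measure is actually computed as ln(1.1 - ρ) so that observations with a correlation of one are not lost.

```python
import math

def log_jumbling(rho, offset=1.1):
    """Log of the inverted correlation measure, ln(offset - rho)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return math.log(offset - rho)

# With the simple inverse (1 - rho), a perfect correlation cannot be logged.
try:
    math.log(1.0 - 1.0)
    naive_defined = True
except ValueError:
    naive_defined = False

at_perfect_agreement = log_jumbling(1.0)   # finite, equal to ln(0.1)
at_sample_median = log_jumbling(0.647)     # median correlation reported in table 1
at_sample_minimum = log_jumbling(-0.995)   # minimum correlation reported in table 1
print(naive_defined, at_perfect_agreement, at_sample_median, at_sample_minimum)
```

The transformed measure increases as the correlation falls, so larger values correspond to more jumbling of forecasts.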


IV. DESCRIPTIVE STATISTICS AND TEST RESULTS 


Table 1 provides descriptive sample statistics. The average trading volume for sample 
observations is greater than the average market-wide trading during the sample period. This is to 
be expected, since news events are associated with increased trading volume and firm-months 
containing news events are more likely to contain the large number of forecast revisions required 


15 To lessen departures of residual errors from normality in estimated regression models and maintain consistency with 
prior studies, the log of trading volume, prior dispersion and correlation measures are used. ΔDisp_jt is not transformed 
because it is not highly skewed and takes on negative values. The return measure is not transformed because the sample 
of raw returns is normally distributed (no outliers) and frequency distributions suggest that transforming absolute returns 
may change the nature of this variable. Results are qualitatively similar using log normalized absolute returns, however. 
The effects of extreme values are also moderated by using log transformations. Nevertheless, I employed several 
diagnostics to assess the potential influence of outliers. For example, alternative tests were conducted after eliminating 
observations if %Vol_jt, Disp_jt-1, |ΔDisp_jt| or (1-ρ_jt) is in its respective upper 2nd percentile. This procedure 
reduces the sample size by about eight percent, but it does not change results qualitatively. In addition, results are not 
changed qualitatively when the sample size is reduced another 12 percent by eliminating observations for which the 
correlation variable of primary interest is negative (i.e., ρ_jt < 0). 

16 To simplify notation, the correlation measure is denoted as the simple additive inverse. It is actually computed as 
ln(1.1 - ρ_jt), however, so that observations are not lost when ρ_jt = 1. Further, there is one observation in which ρ_jt is 
technically undefined, because the variance of current period forecasts (which is used to compute its denominator) is 
zero. For this observation I assigned ρ_jt a value of one. This assignment is somewhat arbitrary, but also consistent with 
Karpoff’s notion of complete consensus following diversity in prior beliefs. Test results are virtually identical when this 
observation is omitted. 





TABLE 1 
Descriptive Statistics for Regression Variables (untransformed) 
N=1,520 

            %Vol_jt   ρ_jt      Disp_jt-1   ΔDisp_jt   r_jt      %Vol_mt   Size_jt 
Maximum     0.595     1.000     15.481      10.745     0.414     0.064     102,027 
Median      0.059     0.647     0.055       -0.004     0.014     0.044     4,562 
Minimum     0.012     -0.995    0.003       -15.222    -0.338    0.030     111 
Std. Dev.   0.049     0.466     0.637       0.710      0.080     0.008     13,003 
Mean        0.071     0.494     0.158       -0.008     0.016     0.046     8,715 

%Vol_jt = % of firm j’s shares traded during month t. 
%Vol_mt = % of NYSE shares traded during month t. 
ρ_jt = the correlation between analysts' current and prior month's earnings forecasts. 
Disp_jt-1 = coefficient of variation (i.e., standard deviation scaled by |mean|) of prior month's earnings forecasts. 
ΔDisp_jt = change in the above coefficient of variation during the current month. 
r_jt = return on firm j’s stock during month t. 
Size_jt = the market value of firm j's equity at the end of month t (in millions of dollars). 
for inclusion in the sample. Selection criteria also result in large sample firms, although there is 
significant variation in size. Further, the median correlation between current and prior forecasts 
is .65, although values of this measure range from 1.00 to -.99. 

Table 2 provides pairwise correlations on the transformed variables. Pairwise correlations 
suggest that firm size is associated with analysts’ agreement since both Disp_jt-1 and (1-ρ_jt) tend 
to be lower for larger firms. This is consistent with the notions that there is more information 
available about large firms, and more of this information is commonly observed. Pairwise 
correlations also suggest that firm-specific disagreement (or relatively high levels of both 
Disp_jt-1 and (1-ρ_jt)) tends to be higher when there is more market-wide trading. These relations 
are consistent with the notion that market-wide disagreement about economy-wide factors 
influences both market-wide trading volume and analysts' disagreement about individual firms. 
Karpoff's model predicts that trading volume is an increasing function of both differential prior 
beliefs and differential contemporaneous belief revisions. This prediction is supported by simple 





TABLE 2 
Pairwise Pearson Correlations for Regression Variables (transformed) 
(Two-Tailed Probability Values in Parentheses) 

                       (1)       (2)       (3)       (4)       (5)       (6)       (7) 
(1) ln(%Vol_jt)        1.000 
                      (0.000) 
(2) ln(1-ρ_jt)         0.056     1.000 
                      (0.030)   (0.000) 
(3) ln(Disp_jt-1)      0.173    -0.068     1.000 
                      (0.000)   (0.008)   (0.000) 
(4) ΔDisp_jt           0.025    -0.026    -0.163     1.000 
                      (0.324)   (0.306)   (0.000)   (0.000) 
(5) |r_jt|             0.310     0.000     0.089    -0.118     1.000 
                      (0.000)   (0.986)   (0.001)   (0.000)   (0.000) 
(6) ln(%Vol_mt)        0.190     0.050     0.049     0.010     0.057     1.000 
                      (0.000)   (0.052)   (0.054)   (0.707)   (0.026)   (0.000) 
(7) ln(Size_jt)        0.345    -0.056    -0.297     0.016    -0.213     0.052     1.000 
                      (0.000)   (0.030)   (0.000)   (0.524)   (0.000)   (0.044)   (0.000) 

%Vol_jt = % of firm j’s shares traded during month t. 
%Vol_mt = % of NYSE shares traded during month t. 
ρ_jt = the correlation between analysts’ current and prior month's earnings forecasts. 
Disp_jt-1 = coefficient of variation (i.e., standard deviation scaled by |mean|) of prior month's earnings forecasts. 
ΔDisp_jt = change in the above coefficient of variation during the current month. 
r_jt = return on firm j's stock during month t. 
Size_jt = the market value of firm j's equity at the end of month t (in millions of dollars). 
correlation tests in table 2, which reveal statistically significant positive associations between 
%Vol_jt and both Disp_jt-1 and (1-ρ_jt). 

Regression estimates of the incremental associations between disagreement measures and 
trading volume are presented in table 3.¹⁷ Standard errors on the coefficients are calculated using 
a procedure developed by Froot (1989) that adjusts for dependency in residual errors when 
generalized least squares is infeasible.¹⁸ Estimated coefficients for the reduced sample have signs 
consistent with all three research hypotheses.¹⁹ The correlation between analysts’ current and 
prior period forecasts is inversely related to trading volume, supporting H1. Results support H2 
because the level of prior dispersion in analysts' forecasts is positively related to trading volume. 
Finally, evidence also supports H3 because changes in analysts' forecast dispersion are positively 
related to trading volume. Further, all three research hypotheses are supported after controlling 
for price changes and either firm size or market-wide trading.²⁰ In other words, all three 
disagreement measures are associated with trading volume that is not explained by price changes, 
firm size or the effects of market-wide trading.²¹ The explanatory power (R²) of models in 
table 3 is comparable to that achieved in typical market-based event studies. Nevertheless, the 
explanatory power of disagreement measures alone is low in an absolute sense.²² 

Table 3 also reports estimates from the "full" sample.²³ I report these estimates because of 
a potential bias in the reduced sample. I cannot distinguish unrevised forecasts that are stale from 
unrevised forecasts that accurately reflect some investors' expectations, and using only revisions 


17 I also report the following regression estimates using stock price to scale the standard deviation of forecasts: 

$$\ln(\%Vol_{jt}) = b_0 + b_1 \ln(Disp_{jt-1}) + b_2 \Delta Disp_{jt} + b_3 \ln(1-\rho_{jt}) + b_4 |r_{jt}| + e_{jt}$$ 

                     b_0       b_1       b_2       b_3       b_4 
Coeff.              -2.513    +0.083    +3.005    +0.053    +3.12 
(Std. Err. Froot)   (0.142)   (0.022)   (1.174)   (0.019)   (0.318) 
N = 1,520; R² = .12. 

All estimates using price to scale dispersion measures are qualitatively similar to those found in table 3, with the 
statistical significance of the coefficient for the correlation between current and prior forecasts being slightly higher 
when price is used. 

18 The Froot procedure adjusts standard errors when multiple observations for the same firms are used. It also adjusts for 
the serial correlation in monthly trading volume. Yet, this procedure does not adjust for dependency within observations 
from the same month. This type of dependency is not a serious concern, however, because results are not greatly 
influenced by adding market-wide trading as a control variable. Monthly NYSE trading volume is highly correlated with 
the average volume in my sample (i.e., a .95 Pearson correlation for both the full and reduced samples). 

19 Evidence of an association between trading volume and different aspects of analysts’ disagreement is found in all annual 
subperiods. For example, when coefficients from the seven annual subperiods are estimated (using stock returns as a 
control), 18 of the resulting 21 coefficients on disagreement are positive. Six of the seven coefficients on ln(1-ρ_jt) are 
positive. All negative coefficients are statistically insignificant and from different years. 

20 Multicollinearity is extremely high when both firm size and market-wide trading are added as control variables, although 
the results still support Karpoff's predictions. When both these variables are added to regression models, test statistics 
(t value (Froot)) for the correlation measure are +1.49 or +1.68 depending on whether dispersion measures are scaled 
by the mean forecast or stock price (see notes 13 and 17). A high level of multicollinearity in these models is 
understandable if both market-wide trading and firm size proxy for investor disagreement. 

21 As note nine suggests, models were also tested for sensitivity to inclusion of other variables. The magnitude of analysts’ 
mean forecast revision, or surprise (denoted SURPRISE_jt) was also tested, although it is positively correlated with the 
magnitude of returns and likely to reflect similar economic influences. Surprise metrics have been used in other studies 
(e.g., Bamber 1986; Ajinkya et al. 1991; Ziebart 1990). Qualitative conclusions concerning analysts’ disagreement are 
not influenced by inclusion of SURPRISE_jt (for an example of a model including SURPRISE_jt, see footnote 24). 

22 My dissertation (i.e., Barron 1993) argued that low R²s can result from disagreement studies even when market-wide 
disagreement is the sole cause of trading volume and analysts’ earnings forecasts are ideal proxies for investors’ 
expectations of cash flows. The argument has two parts. First, noise in financial markets produces measurement error 
in disagreement proxies constructed using only a subset of expectations. Measurement error partially obscures the 
association between trading volume and market-wide disagreement. Second, transaction costs dampen and distort the 
disagreement-volume relation. Simulations suggest that this “friction-effect” greatly amplifies the adverse influence of 
measurement error on R². This argument suggests that R²s in studies like this one are likely to understate the economic 
significance of the disagreement/volume association. 

E uc it was not technologically feasible to calculate “Froot” standard errors for the full sample because of its 
TABLE 3 
Regression Estimates of the Incremental Roles of Differential Prior Beliefs and 
Differential Belief Revisions in Explaining Trading Volume 

$$\ln(\%Vol_{jt}) = a_0 + a_1 \ln(Disp_{jt-1}) + a_2 \Delta Disp_{jt} + a_3 \ln(1-\rho_{jt}) + u_{jt}$$ 

Reduced Sample (includes only forecasts that are revisions of the prior month’s forecasts) 
N=1,520 

                                 Disagreement 
                                 Differential    Differential Belief       Average Belief  Other 
                                 Prior Beliefs   Revisions                 Revisions       Variables 
                     Intercept   ln(Disp_jt-1)   ΔDisp_jt     ln(1-ρ_jt)   |r_jt| 
Predicted Sign                   (+)             (+)          (+) 

R² = 0.04 
Coefficient          -2.516      0.091           0.045        0.048 
Std Error (O.L.S.)   0.041       0.012           0.020        0.017 
Std Error (Froot)    0.082       0.020           0.021        0.020 
t value (Froot)      -30.697     4.566**         2.169*       2.365** 

R² = 0.13 
Coefficient          -2.743      0.080           0.070        0.048        3.191 
Std Error (O.L.S.)   0.043       0.012           0.019        0.017        0.252 
Std Error (Froot)    0.073       0.017           0.018        0.019        0.336 
t value (Froot)      -37.560     4.666**         3.835**      2.567**      9.509** 

R² = 0.15                                                                                  ln(%Vol_mt) 
Coefficient          -1.101      0.076           0.067        0.042        3.097           0.532 
Std Error (O.L.S.)   0.247       0.012           0.019        0.016        0.249           0.078 
Std Error (Froot)    0.265       0.018           0.019        0.018        0.337           0.088 
t value (Froot)      -4.151      4.335**         3.463**      2.252*       9.179**         6.059** 

R² = 0.19                                                                                  ln(Size_jt) 
Coefficient          0.154       0.042           0.058        0.034        2.662           -0.134 
Std Error (O.L.S.)   0.277       0.012           0.018        0.016        0.259           0.013 
Std Error (Froot)    0.749       0.016           0.020        0.018        0.282           0.034 
t value (Froot)      0.206       2.633**         2.953**      1.923*       9.447**         -3.930** 

Full Sample (includes all forecasts for months with at least six revisions) 
N=6,727 

R² = 0.19            Intercept   ln(Disp_jt-1)   ΔDisp_jt     ln(1-ρ_jt)   |r_jt|          ln(%Vol_mt) 
Coefficient          -0.558      0.103           0.003        0.04         2.583           0.693 
Std Error (O.L.S.)   0.114       0.006           0.004        0.013        0.106           0.035 
t value (O.L.S.)     -5.040      17.571**        0.750        3.207**      24.261**        19.945** 

* and ** denote statistical significance (one tailed) at the .05 and .01 levels 

%Vol_jt = % of firm j’s shares traded during month t. 
%Vol_mt = % of NYSE shares traded during month t. 
r_jt = return on firm j’s stock during month t. 
Disp_jt-1 = coefficient of variation (i.e., standard deviation scaled by |mean|) of prior month's earnings forecasts. 
ΔDisp_jt = change in the above coefficient of variation during the current month. 
ρ_jt = correlation between analysts’ current and prior month's earnings forecasts. 
Size_jt = the market value of firm j's equity at the end of month t (in millions of dollars). 
of the prior month’s forecast revisions may introduce measurement error when unrevised 
forecasts accurately reflect current period expectations. Further, disagreement measures in the 
reduced sample are based on relatively few forecasts (sometimes only four). This is also likely 
to contribute to measurement error in the reduced sample. Using the full sample of 6,727 firm- 
months as a robustness check, test results support H1 and H2, but not H3. That is, trading volume 
is positively related to the level of prior dispersion in analysts’ forecasts and inversely related to 
the correlation between current and prior period forecasts, but the positive coefficient on the 
change in dispersion measure is statistically insignificant in the full sample. One explanation is 
that H3 is not suggested by Karpoff’s model. Another explanation is that stale forecasts obscure 
the level of divergence in beliefs, consistent with evidence suggesting stale forecasts are likely 
to obscure measures of convergence (or divergence) in beliefs (Brown and Han 1992; Stickel 
1995). 
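
The disagreement measures discussed above can be illustrated with a minimal sketch. It assumes dispersion is the sample standard deviation of analysts' forecasts scaled by their mean, and that the correlation is a Pearson correlation across analysts who issued both a prior and a current forecast. The forecast values and the function names `dispersion` and `pearson` are hypothetical and are not taken from the study.

```python
import math
import statistics


def dispersion(forecasts):
    """Coefficient of variation: sample standard deviation scaled by the mean."""
    return statistics.stdev(forecasts) / abs(statistics.mean(forecasts))


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical earnings forecasts from the same five analysts (illustrative only).
prior = [2.00, 2.10, 1.90, 2.20, 2.05]
current = [2.05, 2.20, 1.85, 2.30, 2.00]

disp_prior = dispersion(prior)                   # level of prior dispersion
delta_disp = dispersion(current) - disp_prior    # change in dispersion
rho = pearson(current, prior)                    # correlation of current with prior forecasts
one_minus_rho = 1.0 - rho                        # larger values indicate more reshuffling of beliefs
```

In this toy example the analysts largely keep their relative positions, so the correlation is high and one minus the correlation is small, even though dispersion rises.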


V. CONCLUDING REMARKS 


Results from this study are consistent with Karpoff's (1986) predictions. That is, differential 
belief revisions and prior dispersion in beliefs both explain trading volume. If the magnitude of 
price changes controls for the influence of the average belief revision made by investors, then this 
study also provides evidence consistent with the theoretical prediction that trading volume is 
influenced by both the average and differential components of belief revisions (e.g., Kim and 
Verrecchia 1991). 

The conventional wisdom that investor disagreement causes trading volume is consistent 
with theoretical research (e.g., Karpoff 1986; Jang and Ro 1989; Holthausen and Verrecchia 
1990; Kim and Verrecchia 1991; Dontoh and Ronen 1993). Most prior empirical studies find a 
positive relation between disagreement measures and trading volume, although the results of 
these studies are somewhat mixed and subject to measurement error (Abarbanell et al. 1995). This 
study addresses measurement concerns, while further supporting and extending prior evidence 
suggesting that disagreement causes trading volume. First, this study corroborates two disagree- 
ment/volume relations reported in prior accounting studies after eliminating potentially stale 
forecasts and controlling for price changes (Ziebart 1990; Ajinkya et al. 1991). Second, it shows 
that these two disagreement measures have incremental explanatory power in regression models. 
Finally, this study shows that a new measure of disagreement (i.e., the correlation between 
analysts' current and prior forecasts) also has incremental explanatory power for trading volume. 

These results add to prior evidence suggesting that disagreement is a cause of the trading 
reactions observed around accounting events (e.g., Ziebart 1990). However, this study does not 
address the relative importance of differential prior beliefs and differential contemporaneous 


Barron (1993) focused on contemporaneous forecast dispersion (denoted Disp_t) rather than prior forecast dispersion. 
For comparison to Ajinkya et al. (1991), I report the following alternative regression results: 
$\ln(\%Traded_t) = c_0 + c_1 \ln(Disp_t) + c_2 \ln(1-\rho_t) + c_3 \ln(SURPRISE_t) + c_4 |r_t| + \varepsilon_t$ 
Coeff.              -2.488    0.044    0.043    0.068    0.980    N = 1,520 
(Std. Err. Froot)   (0.120)  (0.020)  (0.020)  (0.025)  (0.310)   R² = .13 


The theoretical basis for this test is Holthausen and Verrecchia’s (1990) informedness/consensus model. The 
dissertation argued analytically that the dispersion in expectations is an inverse function of both informedness and 
consensus. Further, ρ_t may be an explicit proxy for “consensus.” If ρ_t controls for consensus in the above model, then 
dispersion is left to proxy for informedness. Further, if dispersion proxies for informedness, then its positive coefficient 
is inconsistent with the predictions of Holthausen and Verrecchia (1990). One explanation for this inconsistency is that 
contemporaneous dispersion proxies for prior period information asymmetry rather than informedness. Another 
explanation is the influence of transaction processing costs. Barron (1993) also argued that transaction processing costs 
are likely to prevent more trades when the dispersion in expectations is small. 
belief revisions in explaining trading reactions around specific events. The relative importance 
of these two influences is potentially important to accountants interested in evaluating the 
“content” or “quality” of financial disclosures. According to Karpoff (1986), a trading reaction 
caused primarily by the resolution of prior period disagreement suggests news with a different 
quality (i.e., news with a common interpretation) more than a trading reaction caused primarily 
by differential belief revisions. Thus, future studies of trading volume reactions around specific 
types of accounting events (e.g., earnings releases) may benefit from using the correlation 
between analysts’ current and prior forecasts. 

Use of the correlation measure introduced in this study may extend beyond the study of 
trading volume reactions. A high correlation between the relative positions of analysts’ current 
and prior forecasts likely reflects the common interpretations referred to by Karpoff (1986), 
especially when this correlation is measured immediately surrounding an event that reduces 
forecast dispersion. Evidence that an announcement has been commonly interpreted also 
suggests that its contents have been communicated clearly. Accountants strive to create reports 
that communicate clearly. Thus, the correlation between current and prior forecasts could be 
useful in tests for evidence of common interpretations around different types of financial reports. 
Like traditional dispersion measures, this correlation measure can be constructed whenever 
sufficient quantities of reliable expectational data are obtainable, whether in the field or 
laboratory. 
APPENDIX A 
EMPIRICAL DATA USED FOR THIS STUDY 

For firms having a following of 30 or more analysts during the period from January 1984 to December 1990. 

                                                              (1)         (2)         (3) 
                                                            I/B/E/S     "Full"     "Reduced" 
                                                             (SS)       Sample      Sample 
Total number of one-year-ahead forecasts                   452,457     246,161       8,120 
Number of one-year forecasts in a series of two or more    386,038     216,155       8,120 
Number of forecast revisions                               116,150      64,558       8,120 
Number of firms represented                                    203         172         166 
Number of firm-months represented                           13,083       6,727       1,520 

(1) I/B/E/S forecasts for identifiable sell-side analysts (denoted SS). 
(2) "Full" Sample: I/B/E/S (SS) forecasts (non-October 1987) made during firm-months having ≥ 6 forecast 
    revisions and CRSP volume and price data available. 
(3) "Reduced" Sample: I/B/E/S (SS) forecast revisions (non-October 1987) of the prior month's forecast revision 
    made during months having ≥ 4 such revisions and CRSP volume and price data available. 
I. INTRODUCTION 


Pricing equity derivatives (stock options, warrants, convertible debt, and convertible 
preferred stock) is an important topic in accounting and finance. Pricing influences 
whether managers issue these securities and whether investors hold them. Pricing also 
occurs to satisfy financial reporting obligations, as described in FAS No. 107, Disclosures about 
Fair Value of Financial Instruments, Securities and Exchange Commission (SEC) Regulation S-K 
on executive compensation disclosures in proxy statements, and a staff interpretation of SEC 
Staff Accounting Bulletin No. 57 on accounting for equity derivatives issued to nonemployees.¹ 
Further, the Financial Accounting Standards Board (FASB) currently is studying additional 
financial reporting requirements that would expand the need to value equity derivatives. These 
include the Discussion Memorandum, Distinguishing Between Liability and Equity Instruments 
and Accounting for Instruments with Characteristics of Both, and the Exposure Draft (ED), 
Accounting for Stock-based Compensation. 

An important factor in pricing equity derivatives is the volatility (standard deviation) of the 
return on the underlying common stock.² In most option pricing models, this volatility is a key 
input: it is unobservable and difficult to forecast, and option values (and hence profits from trading 
options) are sensitive to errors in estimating volatility.³ For this reason, there is a large literature 
on the estimation of volatility.⁴ However, this literature focuses on short-term volatility for 
pricing exchange-traded options, and hence it may not apply directly to the problem of predicting 
long-term volatility for a broad cross-section of firms. Firms with exchange-traded options tend 
to be large, well-established firms with actively traded stock compared to firms in general and to 
firms that make heavy use of convertible securities and employee stock options (including IPO 
firms) in particular. Moreover, the stability of volatility is likely to differ over short versus long 
horizons, and data requirements make some of the forecasting techniques (e.g., those involving 
implied volatility) impracticable for long horizons. 

Since most studies focus on short-term volatility for exchange-traded options, there is little 
research on predicting volatility for pricing the long-maturity equity derivatives of greatest 
interest to accountants: convertible debt, convertible preferred stock, warrants, and employee 
stock options.⁵ The lack of research on predicting long-term volatility is especially problematic 


1 In a speech delivered at the 1995 22nd Annual National Conference on Current SEC Developments, Michael Morrissey 
stated that the SEC staff would accept date-of-issuance fair value as a basis for recognizing expense in conjunction with 
equity options issued to nonemployees. Moreover, in its report on derivatives, the U.S. General Accounting Office 
suggests the FASB should adopt market value accounting for all financial instruments (GAO 1994, 106): "[A market 
value] accounting model would not only solve many of the accounting issues concerning derivatives but, more 
importantly, would provide a new level of transparency in financial reporting of hedging activities." 

2 For an introduction to the pricing of options, see Cox and Rubinstein (1985) or Deloitte & Touche (1994). Black and 
Scholes (1973) present a model to price a European call option (exercisable only at maturity) on a non-dividend paying 
stock that follows a log-normal distribution with constant variance. Several researchers have extended the Black- 
Scholes model to allow for the payment of cash dividends (Roll 1977; Geske 1979; Whaley 1981), alternative stock 
return distributions (Cox and Ross 1976; Merton 1976), transactions costs (Leland 1985), stochastic volatility (Hull and 
White 1987), and non-transferability (Huddart 1994). Lambert et al. (1991) and Huddart (1994) contrast the effect of 
volatility on option values from the perspective of issuers and investors versus (risk-averse) employees. 

3 Volatility is so important that traders often quote options in terms of Black-Scholes-equivalent volatilities rather than 
dollar prices (Derman and Kani 1994). 

4 Beckers (1981) and Lamoureux and Lastrapes (1993) compare the predictive ability of implied volatility and historical 
volatility; Geske and Roll (1984) and Karolyi (1993) investigate Bayesian (shrinkage) techniques; and Engle et al. 
(1993) and Noh et al. (1993) examine stochastic volatility models. 

5 The Crystal Report (1994) develops a shrinkage estimator that takes account of mean reversion in long-term volatility; 
however, the performance of the estimator is not discussed. 
given the FASB’s current attention to accounting for stock-based compensation. Several 
participants of the ED's field test called for more guidance on how to compute volatility. The 
report on the results of the field test states (FASB 1994a, 4-5): 


The area of most concern was the assumption about expected volatility. Many 
participants requested more guidance on how to compute historical volatility and how 
to adjust historical volatility to estimate expected volatility. 

Some participants suggested that the FASB prescribe the method and period...that 
companies should use to compute historical volatility on which to base their estimates 
of expected volatility. 


Many commentators on the stock-based compensation project have argued that valuation 
models for employee stock options are too imprecise to justify requiring recognition for financial 
reporting purposes (Coopers & Lybrand 1993; Beese 1994; Share Data 1994; The Wyatt 
Company 1994a, 1994b). In deference to these and other concerns, the FASB has announced that 
it would not require expense recognition for stock options, but would instead permit companies 
electing nonrecognition to disclose pro forma net income effects of expense recognition in a note 
to the financial statements (FASB 1994b). That action ensures continuing interest in the 
estimation of long-term volatility. 

The purpose of this study is to examine empirically the prediction of long-term stock return 
volatility, where long-term volatility is computed using monthly stock returns over five years. We 
focus on three primary questions: First, when using a simple extrapolation of a firm’s historical 
volatility to forecast future volatility, with what frequency should returns be computed and over 
what historical period should volatility be measured? Second, how accurate is a prediction based 
exclusively on historical volatilities of comparable firms? This question is particularly relevant 
for firms lacking sufficient data to compute historical volatility (e.g., IPO firms). Third, for firms 
with enough data to compute historical volatility, how accurate is a shrinkage forecast formed by 
adjusting a historical forecast toward a comparable-firms forecast? 
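
As a rough illustration of the third question, a shrinkage forecast can be sketched as a weighted average that pulls the firm's own historical volatility toward the comparable-firms forecast. The linear form, the function name `shrinkage_forecast`, and the weight used below are assumptions made for illustration only. This passage does not specify how the shrinkage weight is chosen.

```python
def shrinkage_forecast(hist_vol, comparable_vol, weight):
    """Weighted average of a historical and a comparable-firms volatility forecast.

    `weight` is the (hypothetical) weight placed on the firm's own history;
    the remainder is placed on the comparable-firms forecast.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * hist_vol + (1.0 - weight) * comparable_vol


# Illustrative inputs: firm history of 0.50, comparable firms at 0.40, weight 0.75.
example_forecast = shrinkage_forecast(0.50, 0.40, 0.75)
```

A weight of one reproduces the pure historical forecast and a weight of zero reproduces the pure comparable-firms forecast, so the two simpler methods are special cases of this sketch.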

Taken together, these three questions provide insight on two important tradeoffs. The first is 
between more data—a greater return frequency or a longer measurement period—and extreme 
or obsolete data. The use of more data leads to a lower standard error of the estimate of volatility, 
but the use of higher frequency returns or a longer measurement period increases the chances 
extreme returns will influence the estimate of volatility or the possibility volatility has changed 
over time. The second tradeoff is between more, but potentially irrelevant, data for comparable 
firms. The use of comparable firms expands the information set available to estimate volatility; 
at the same time it increases the chances irrelevant data will be used in estimation. 

The results of this study indicate the following. First, when using historical volatility to 
forecast future volatility of monthly returns over five years, returns should be measured either 
weekly or monthly and the historical period should be approximately five years. Second, if data 
exist to compute historical volatility, then a historical forecast is more accurate than any of the 
six comparable-firms forecasts we examine. If data are not available to compute historical 
volatility, then selecting comparable firms on the basis of industry and firm size provides the best 
comparable-firms forecast. Third, a shrinkage forecast is more accurate than either a historical or 
comparable-firms forecast. 

We believe our research will contribute to the development of accounting for employee stock 
options by providing evidence on two related questions: One, what is the best estimate of long- 
term volatility? And two, how accurate is this estimate? An answer to the second question will 
help preparers of financial statements decide whether the value of employee stock options can be 
measured with sufficient precision to support voluntary recognition for financial reporting 
purposes, while the first question is relevant to the estimation of stock option values, either for 
recognition on the income statement or disclosure in the notes. The answer to the first question 
is important to those responsible for implementing the FASB’s forthcoming financial reporting 
requirements, as well as to investors and accounting researchers interested in interpreting 
compensation disclosures. 

The remainder of this paper is organized as follows. Section II describes sample selection and 
presents descriptive statistics. The empirical results appear in section III, and a discussion of the 
results appears in section IV. Finally, section V summarizes the paper. 


II. SAMPLE SELECTION AND DESCRIPTIVE STATISTICS 


This paper examines empirically the prediction of long-term volatility. We compute volatility 
using monthly continuously compounded stock returns over the five years after the forecast date 
(four years minimum). As discussed in greater detail below, we use monthly stock returns because 
monthly—but not weekly or daily—stock returns are approximately normally distributed, in keeping 
with the assumptions of the Black-Scholes model. We refer to the five-year period as the forecast 
period, and we denote volatility over this period FUTURE. Five years is chosen to reflect the long 
expected term typical of many convertible securities (before they are called) and employee stock 
options (before they are exercised voluntarily).⁶ The forecast date for each firm-period is taken to 
be six months after the end of the firm's fiscal year. A six-month forecast lag is chosen to match 
the reporting lag assumed for the comparable firms, an assumption designed to ensure that accoun- 
ting data are publicly available for the comparable firms (Alford et al. 1994). 

To avoid confusion, the firms for which FUTURE is to be predicted are referred to as target 
firms (as opposed to comparable firms later on). Our sample of target firm-periods satisfies the 
following criteria: 

1. The firm appears on the merged annual Compustat file and either the NYSE/ASE or 
NASDAQ daily CRSP files prepared by the Center for Research in Security Prices at the 
University of Chicago. 

2. The fiscal year ends during the December, 1966 to June, 1987 period. With a six-month 
forecasting lag, December, 1966 (June, 1987) is the earliest (latest) fiscal year end such that 
five years of prior (subsequent) data are available on the NYSE/ASE and NASDAQ daily 
CRSP files. 

3. Market value of equity (MVE) is available on the forecast date, and data to compute the ratio 
of total long-term debt plus preferred stock to total assets (D/TA) are available for the fiscal 
year preceding the forecast date. As described later in the paper, MVE and D/TA are used to 
identify comparable firms. 

4. The firm has at least 48 monthly returns available during the 60-month (five-year) forecast 
period to compute FUTURE. 

5. The firm does not already appear in the sample during the previous four years. Therefore, 
forecast-periods for a specific firm do not overlap since a firm never enters the sample more 
than once in any five-year period. 
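
Criterion 5 can be sketched as a simple filter over a firm's candidate fiscal years. The sketch assumes candidates are processed from the earliest year forward and that a year is kept only if the firm has not entered the sample in the previous four years. The greedy ordering and the function name `select_non_overlapping` are assumptions, since the text does not state how competing candidate years are resolved.

```python
def select_non_overlapping(fiscal_years, min_gap=5):
    """Keep a fiscal year only if it is at least `min_gap` years after the last kept year."""
    selected = []
    for year in sorted(fiscal_years):
        if not selected or year - selected[-1] >= min_gap:
            selected.append(year)
    return selected


# Hypothetical firm with data for every fiscal year from 1967 through 1979.
kept_years = select_non_overlapping(range(1967, 1980))
```

With a gap of five years, the five-year forecast periods attached to the kept years cannot overlap for the same firm.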


These criteria resulted in a sample of 13,851 firm-periods for 6,879 distinct firms.⁷ 


6 Convertible securities are often call protected for the first few years, with a maximum horizon equal to the security's 
maturity, if any. Employee stock options typically vest over three years and have a ten-year maturity. 

7 The sample of target firm-periods exhibits no unusual clustering by one-digit industry (relative to the time-weighted 
Compustat universe) or by time; however, the firms in the sample are slightly larger than those on COMPUSTAT. The 
median market value of equity is $40.3 million for the sample of target firm-periods versus $33.1 million for the 
Compustat universe (aggregated over the sample period). 
We use monthly stock returns to compute FUTURE because monthly returns are approxi- 
mately normally distributed, while daily and weekly returns are not. The use of returns that are 
normally distributed is appealing because the Black-Scholes model assumes stock prices follow 
a log-normal distribution, which is equivalent to assuming continuous returns follow a normal 
distribution. Therefore, by using monthly returns, the distribution of returns is consistent with the 
assumptions of the option pricing model. 

Evidence on the distribution of daily, weekly (Tuesday-to-Tuesday), and monthly stock 
returns appears in table 1. For each return frequency, we present the distribution of p-values from 
Kolmogorov-Smirnov goodness-of-fit tests of normality over the forecast period for our sample 
of firm-periods.⁸ Under the null hypothesis of normality, the p-values should be approximately 
uniformly distributed between 0.0 and 1.0 (so that rejection rates and significance levels 
correspond to one another). The results in table 1 provide overwhelming evidence that neither 
daily nor weekly returns are normally distributed. For daily returns, the null hypothesis is rejected 
for 99.7 and 99.9% of the firm-periods at the 0.01 and 0.10 levels, while for weekly returns the 
rejection rates are 51.8 and 75.3%, respectively. In contrast, normality of monthly returns is 
rejected for only 2.4 and 9.5% of the firm-periods at the 0.01 and 0.10 levels. These results suggest 
monthly—but not weekly or daily—returns are approximately normally distributed for our 
sample. These results are consistent with earlier research summarized in Fama (1976). 
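
The kind of test summarized in table 1 can be sketched as follows. This is a minimal, standard-library illustration and not the authors' procedure: it assumes the normal distribution is fitted with the sample mean and standard deviation, and it uses the asymptotic Kolmogorov distribution for the p-value. With estimated parameters, classical K-S p-values tend to be conservative, so the numbers from this sketch are indicative only. The function name and the two deterministic samples are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def ks_normality_pvalue(returns):
    """Approximate p-value for a K-S test that `returns` are normally distributed."""
    n = len(returns)
    # Assumption: fit the normal with the sample mean and standard deviation.
    fitted = NormalDist(statistics.mean(returns), statistics.stdev(returns))
    d = 0.0
    for i, x in enumerate(sorted(returns), start=1):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical and fitted distribution functions.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The series below converges slowly here; the p-value is essentially one.
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * series))


# Deterministic illustration: 60 "monthly" returns placed at normal quantiles...
near_normal = [NormalDist().inv_cdf((i - 0.5) / 60) for i in range(1, 61)]
# ...versus 58 tiny returns plus two extreme ones (a fat-tailed pattern).
fat_tailed = [-0.01 + 0.02 * i / 57 for i in range(58)] + [-1.0, 1.0]

p_near_normal = ks_normality_pvalue(near_normal)
p_fat_tailed = ks_normality_pvalue(fat_tailed)
```

Applying a function like this to each firm-period and tabulating the resulting p-values by interval would give a frequency distribution of the same form as table 1.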

Although the volatility to be predicted, FUTURE, is computed with monthly returns over five 
years (four years minimum), the efficacy of using past volatility to predict future volatility is 
investigated using several different methods of estimating past volatility: daily, weekly, and 


8 We also examined the distributions of the studentized range, and the results were very similar to those for the 
Kolmogorov-Smirnov tests reported in the paper. 


TABLE 1 


Frequency distribution of p-values from Kolmogorov-Smirnov (K-S) goodness-of-fit 
tests of normality of daily, weekly, and monthly continuously compounded returns over 
the sample of non-overlapping forecast periods (13,851 observations) 


                     Daily                Weekly               Monthly 
K-S             Freq    Cum Freq     Freq    Cum Freq     Freq    Cum Freq 
p-value          (%)       (%)        (%)       (%)        (%)       (%) 
0.00 - 0.01     99.7      99.7       51.8      51.8        2.4       2.4 
0.01 - 0.02      0.1      99.8        6.3      58.1        1.0       3.4 
0.02 - 0.03      0.1      99.9        3.8      61.9        0.9       4.3 
0.03 - 0.04      0.0      99.9        3.1      65.1        0.7       5.0 
0.04 - 0.05      0.0      99.9        2.4      67.4        0.6       5.7 
0.05 - 0.06      0.0      99.9        1.9      69.3        0.8       6.5 
0.06 - 0.07      0.0      99.9        1.7      71.0        0.7       7.2 
0.07 - 0.08      0.0      99.9        1.6      72.7        0.8       8.0 
0.08 - 0.09      0.0      99.9        1.5      74.1        0.7       8.7 
0.09 - 0.10      0.0      99.9        1.2      75.3        0.7       9.5 

0.10 - 1.00      0.1     100.0       24.7     100.0       90.5     100.0 
monthly returns over four historical periods. We denote historical volatility for return frequency 
f over the y-year period before the forecast date by HIST_f_y, where f = D, W, and M (daily, 
weekly, and monthly, respectively), and y = 1, 3, 5, and 7 years.⁹ 
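
A minimal sketch of the volatility computation follows. It takes the sample standard deviation of continuously compounded returns (with a T - 1 denominator) and scales it to an annual basis. The annualization factors of 252 trading days, 52 weeks, and 12 months, the helper names, and the use of simple price relatives (ignoring dividends) are assumptions made for illustration, not details given in the paper.

```python
import math
import statistics

# Assumed number of return observations per year for each frequency.
PERIODS_PER_YEAR = {"D": 252, "W": 52, "M": 12}


def log_returns(prices):
    """Continuously compounded returns from a price series (dividends ignored)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def annualized_volatility(returns, frequency):
    """Sample standard deviation (T - 1 denominator) expressed on an annual basis."""
    return statistics.stdev(returns) * math.sqrt(PERIODS_PER_YEAR[frequency])


# Illustrative input: 60 monthly returns alternating between +10% and -10%.
toy_monthly_returns = [0.10, -0.10] * 30
toy_volatility = annualized_volatility(toy_monthly_returns, "M")
```

The same function applied to the returns after the forecast date would give FUTURE, and applied to the y years before the forecast date would give HIST_f_y for the chosen frequency.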

Quartiles for HIST_f_y, as well as for FUTURE, MVE, and D/TA, appear in panel A of table 
2. The samples for each variable are not the same since our sample selection criteria for FUTURE 
do not require that HIST_f_y be available, which allows us to investigate the prediction of 
volatility for firms lacking historical data (e.g., IPO firms). For each y-year estimation period, the 
median volatility is greatest for weekly returns but (usually) similar for daily and monthly returns, 
while the dispersion of volatility, as measured by the interquartile range, is greatest for daily 
returns and least for monthly returns. For instance, for the five-year period, the medians of daily, 
weekly, and monthly volatility are 0.377, 0.387, and 0.378, while the interquartile ranges are 
0.243, 0.227, and 0.217. In contrast, the median of FUTURE is 0.415. The differences between 
the respective fractiles of FUTURE and HIST_f_5 reflect the fact that volatility is higher on 
average for the mostly smaller firms without historical data available to compute HIST_f_5.¹⁰ 
Volatility equal to 0.415 (the median of FUTURE) corresponds to a value of $46.04 for a standard 
at-the-money option, where a standard option is defined to be a call option on a stock with a price 
of $100, a dividend yield of zero, an option term of five years, and a simple riskless interest rate 
of 0.065. 
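
The option value quoted above can be checked with a short Black-Scholes sketch. The one assumption added here is that the simple riskless rate of 0.065 is converted to a continuously compounded rate, ln(1.065), before it enters the formula. The function name and argument names are illustrative.

```python
import math
from statistics import NormalDist


def black_scholes_call(spot, strike, years, simple_rate, volatility):
    """Black-Scholes value of a European call on a non-dividend-paying stock."""
    # Assumed conversion from a simple annual rate to a continuously compounded rate.
    r = math.log(1.0 + simple_rate)
    sd = volatility * math.sqrt(years)
    d1 = (math.log(spot / strike) + (r + 0.5 * volatility ** 2) * years) / sd
    d2 = d1 - sd
    n = NormalDist().cdf
    return spot * n(d1) - strike * math.exp(-r * years) * n(d2)


# Standard at-the-money option: price 100, five-year term, volatility 0.415.
standard_option_value = black_scholes_call(100.0, 100.0, 5.0, 0.065, 0.415)
```

Under these assumptions the value comes out at roughly $46.04, in line with the figure reported for the median of FUTURE.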

Spearman rank correlations between HIST_D_y, HIST_W_y, and HIST_M_y for each value 
of y appear in panel B of table 2. For a given value of y, daily, weekly, and monthly volatility are 
highly correlated with each other. The correlation between each pair of volatilities is never less 
than 0.84, and as the estimation period lengthens, the correlations increase as well. In addition, 
although not reported in table 2, all measures of historical volatility are positively correlated with 
each other and with D/TA, and negatively correlated with MVE. (All correlations are significant 
at the 0.0001 level.) For example, the correlation between HIST_M_5 and MVE is -0.56, while 
that between HIST_M_5 and D/TA is 0.06. The results for MVE and D/TA are consistent with 
prior research on the determinants of stock return volatility (Christie 1982; Karolyi 1993), and 
support the use of MVE and D/TA to select comparable firms. 


III. EMPIRICAL RESULTS 
Historical Forecast 


Our first research question examines the accuracy of predicting future volatility with a simple 
extrapolation of a target firm's historical volatility, where historical volatility is measured using 
daily, weekly, and monthly returns over periods of varying length.¹¹ The choice of return 
frequency and estimation period involves a tradeoff between minimizing sampling error (which 


9 We compute FUTURE and HIST_f_y as the square root of the unbiased estimate of the variance: 

$$\hat{\sigma} = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(r_t - \bar{r}\right)^2}$$ 

where $r_t$ is the continuously compounded return for day / week / month t and $\bar{r}$ is the mean of $r_t$ over days / weeks / months 
t = 1,...,T. Although the estimate of $\sigma^2$ is unbiased, the estimate of $\sigma$ is biased. The correction factor for $\sigma$ is a (complicated) 
function of the gamma function, and the bias of $\sigma$ is negligible for all but very small values of T (Cox and Rubinstein 1985, 
256). 

10 The medians of FUTURE and MVE are 0.369 and $67.6 million, respectively, for the 8,815 firm-periods with HIST_f_5 
available versus 0.509 and $19.5 million for the 5,036 firm-periods without HIST_f_5 available. 

11 The empirical results discussed in the paper include the effect of the October 19, 1987 market crash. To assess the 
sensitivity of our conclusions to the occurrence of very extreme stock returns, we repeated our analysis excluding stock 
returns surrounding the market crash (daily returns for October 10 to 26, weekly returns for October 20 and 27, and 
monthly returns for October). The results are qualitatively very similar to those reported in the paper. When the crash 

(Footnote 11 continued on page 606) 
TABLE 2 
Descriptive statistics of historical and future volatility, market value of equity, and debt 
to total assets (13,851 observations) 


Panel A: Fractiles 





              Number 
Variable      of Obs     25%ile     Median     75%ile 
HIST_D_1      10787      0.267      0.376      0.527 
HIST_W_1      10787      0.275      0.385      0.523 
HIST_M_1      10787      0.258      0.362      0.507 
HIST_D_3       9413      0.271      0.373      0.515 
HIST_W_3       9413      0.284      0.384      0.513 
HIST_M_3       9413      0.276      0.372      0.494 
HIST_D_5       8815      0.274      0.377      0.517 
HIST_W_5       8815      0.288      0.387      0.515 
HIST_M_5       8815      0.280      0.378      0.497 
HIST_D_7       7288      0.272      0.368      0.506 
HIST_W_7       7288      0.285      0.378      0.504 
HIST_M_7       7288      0.278      0.369      0.487 
FUTURE        13851      0.309      0.415      0.548 
MVE           13851     11.681     40.302    165.690 
D/TA          13851      0.106      0.242      0.399 


Panel B: Spearman (rank) correlations* 


                                  Number of years (y) 
                               1       3       5       7 
Corr (HIST_D_y, HIST_W_y)    0.95    0.96    0.97    0.98 
Corr (HIST_D_y, HIST_M_y)    0.84    0.90    0.92    0.93 
Corr (HIST_W_y, HIST_M_y)    0.89    0.95    0.96    0.97 


* All correlations are significantly different from zero at the 0.0001 level. 


HIST_f_y is stock return volatility (standard deviation) of continuously compounded returns measured with frequency 
f over the y year(s) before the forecast date, f= D (daily), W (weekly), and M (monthly), and y= 1,3, 5, and 7, where 
at least 80 percent of the returns are available over the y year(s), expressed on an annual basis. 

FUTURE is stock return volatility over the forecast period, the five years (four years minimum) after the forecast date, 
measured using monthly returns, expressed on an annual basis. 

MVE is the market value of equity on the forecast date, in millions of dollars. 

D/TA is the ratio of book value of debt to book value of total assets at the end of the fiscal year before the forecast date. 
TABLE 3 
Forecast accuracy of historical volatility HIST_f_y, f = D (daily), W (weekly), and M 
(monthly), and y = 1, 3, 5, and 7 years (7,259 observations) 


Forecast      Median     Fraction     Median     90%        Relative 
Method        ERROR      Positive     |ERROR|    |ERROR|    Accuracy 
HIST_D_1       0.016     0.522*       0.210      0.541      7 
HIST_W_1       0.013     0.517        0.200      0.516      6 
HIST_M_1       0.068     0.583*       0.235      0.560      8 
HIST_D_3      -0.011     0.484        0.193      0.508      6 
HIST_W_3      -0.022     0.468*       0.174      0.464      1,2 
HIST_M_3       0.022     0.534*       0.186      0.471      4,5 
HIST_D_5      -0.031     0.458*       0.193      0.506      5,6 
HIST_W_5      -0.046     0.431*       0.173      0.467      1 
HIST_M_5      -0.005     0.492        0.173      0.456      1,2,3 
HIST_D_7      -0.041     0.444*       0.195      0.521      6 
HIST_W_7      -0.055     0.420*       0.182      0.479      2,3 
HIST_M_7      -0.021     0.471*       0.180      0.469      3,4 


* Significantly different than 0.500 at the 0.002 level using a two-tailed binomial test. 

ERROR is the scaled volatility forecast error = (FUTURE - HIST_f_y) / FUTURE, where FUTURE is stock return 
volatility over the five-year forecast period and HIST_f_y is volatility for frequency f over the y-year period before the 
forecast date, f = D (daily), W (weekly), and M (monthly), and y = 1, 3, 5, and 7. 

|ERROR| is the absolute value of ERROR. 

Relative Accuracy is an ordinal rank of the forecast methods by |ERROR| using the nonparametric Friedman Test, where 
forecast methods with different scores are statistically different from one another at the 0.002 level, and where a score 
of 1 indicates the most accurate forecast method. 


favors high frequency and a long period) and avoiding extreme returns or a large structural shift 
in the time-series of stock returns (which favors low frequency and a short period). 

In table 3, we compare the accuracy of predicting FUTURE with the 12 volatility measures 
HIST_f_y, where f denotes the return frequency and y represents the number of years used to 
compute historical volatility. To facilitate comparisons across the 12 volatility measures, the 
analysis is limited to the sample of firm-periods with complete data available. ERROR is the 
scaled forecast error, (FUTURE - HIST_f_y) / FUTURE, and |ERROR| is the absolute value of 

(Footnote 11 continued) 

is excluded from the analysis, the accuracy of almost all of the forecast methods declines slightly, but the methods' relative 
accuracy is largely unchanged. This result is due primarily to a combination of two factors: First, except for the crash and the 
early 1980s, long-term (five-year) volatility has trended downward since the mid-1970s; therefore, volatility pre- and 
post-crash is more similar when the crash is included than it is when the crash is excluded. Second, because the crash occurs towards 
the end of the sample period, considerably more forecast periods are affected than historical periods: 4,156 observations (all 
firm-periods with fiscal year ending April, 1982 to March, 1987) versus 79 observations (all firm-periods with fiscal year 
ending April to June, 1987). Consequently, the crash has a much greater effect on the predicted variable than it does on the 
predictor variable. We thank an anonymous referee for suggesting this sensitivity analysis. 
ERROR. Median ERROR and the fraction positive measure bias, and the median |ERROR| and 
90th-percentile |ERROR| summarize forecast accuracy (the 90th percentile measures the dispersion 
of the distribution of |ERROR|). We compare forecast accuracy using an index of relative 
accuracy. The scores are assigned on the basis of |ERROR| using the nonparametric Friedman 
Test, where forecast methods with different scores are statistically different from one another at 
the 0.002 level, and forecast methods with multiple scores are indistinguishable from two or more 
other methods that are each different from one another. A score of one indicates the most 
accurate forecast method. We conduct our tests at a somewhat conservative (and arbitrary) 0.002 
level of significance to mitigate the effects of possible cross-sectional dependence in our sample. 
(A p-value of 0.002 corresponds to a t-statistic of 3.09.) 
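
The bias and accuracy summaries described above can be illustrated with a minimal sketch. The function names, the interpolation rule used for the 90th percentile, and the sample volatilities are assumptions made for illustration; they are not taken from the study.

```python
from statistics import median


def scaled_errors(future, forecast):
    """ERROR = (FUTURE - FORECAST) / FUTURE for paired observations."""
    return [(f - h) / f for f, h in zip(future, forecast)]


def percentile(values, q):
    """Percentile by linear interpolation between order statistics.

    The interpolation rule is an assumption; the text does not state one.
    """
    xs = sorted(values)
    pos = (len(xs) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize(future, forecast):
    """Return the two bias measures and the two accuracy measures."""
    err = scaled_errors(future, forecast)
    abs_err = [abs(e) for e in err]
    return {
        "median_error": median(err),                             # bias
        "fraction_positive": sum(e > 0 for e in err) / len(err), # bias
        "median_abs_error": median(abs_err),                     # accuracy
        "p90_abs_error": percentile(abs_err, 0.90),              # dispersion
    }


# Hypothetical annualized volatilities, for illustration only.
future = [0.30, 0.40, 0.25, 0.50]
hist = [0.33, 0.30, 0.25, 0.40]
stats = summarize(future, hist)
```

A fraction positive near 0.5 together with a median ERROR near zero would indicate an unbiased forecast in the sense used here.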

The results in table 3 reveal HIST_W_1, HIST_D_3, and HIST_M_5 are unbiased forecasts 
of FUTURE, while the fraction positive for the other forecasts is significantly different from 0.5 
at the 0.002 level. The most accurate forecasts are HIST_W_3, HIST_W_5, and HIST_M_5; each 
is ranked one by the Friedman Test (again at the 0.002 level). In terms of both bias and accuracy, 
HIST_M_5 is the best forecast of FUTURE, with median |ERROR| equal to 0.173. The 
corresponding median absolute percentage error in pricing a standard at-the-money option is 
9.0%. Overall, the evidence presented in table 3 suggests that when using a target firm's historical 
volatility to predict its future volatility over the next five years, monthly returns should be used 
and the historical period should be approximately five years. 
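
The relative-accuracy scores rest on ranking |ERROR| across forecast methods within each firm-period. The sketch below computes the mean rank of each method and the overall Friedman chi-square statistic, assuming the textbook form without a tie correction. It does not reproduce the pairwise comparisons used to assign scores, and the function names are illustrative.

```python
def average_ranks(row):
    """Rank one firm-period's values from smallest (1) upward.

    Tied values share the average of the ranks they occupy.
    """
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman(abs_errors):
    """abs_errors[b][m] is |ERROR| of method m in firm-period b.

    Returns (mean rank per method, Friedman chi-square, no tie correction).
    """
    n, k = len(abs_errors), len(abs_errors[0])
    rank_sums = [0.0] * k
    for row in abs_errors:
        for m, r in enumerate(average_ranks(row)):
            rank_sums[m] += r
    stat = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return [s / n for s in rank_sums], stat


# Hypothetical |ERROR| values: three firm-periods, three forecast methods.
mean_ranks, chi_square = friedman([
    [0.10, 0.20, 0.30],
    [0.05, 0.20, 0.40],
    [0.10, 0.30, 0.50],
])
```

A lower mean rank corresponds to a more accurate method, which is the ordering that the relative-accuracy scores summarize.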


Comparable-Firms Forecast 


Our second research question explores the accuracy of predicting future volatility using a 
forecast based exclusively on historical volatilities of comparable firms. Our analysis provides 
evidence on the tradeoff between more but potentially irrelevant data for comparable firms. The 
ED on accounting for stock-based compensation suggests this forecasting technique for firms 
without sufficient data to compute historical volatility (FASB 1993, paragraph 195): 


In other circumstances, information on past experience may not be available. For 
example, an entity whose common stock has only recently become publicly traded will 
not have historical data on the volatility of its own stock. In that situation, expected 
volatility may be based on the average volatilities of similar entities... 


To provide evidence on the FASB recommendation, we examined forecast accuracy for six 
different methods of selecting comparable firms: 


MARKET: The universe of all eligible comparable firms (defined below). 

INDUSTRY: The subset of eligible firms with the same n-digit SIC code as the target firm, where 
n is chosen to be as large as possible such that at least ten comparable firms exist with the same 
n-digit SIC code.¹⁴ For example, if there are fewer than ten eligible firms with the same 4-digit 
SIC code, then n is set to 3, and if there are fewer than ten eligible firms with the same 
3-digit SIC code, then n is set to 2. This process is continued until a value of n is found such 
that at least ten comparable firms exist with the same n-digit SIC code.¹⁵ 


12 The Friedman test allows multiple comparisons of several related samples. For each firm-period, the various |ERROR| 
are ranked, and tests for differences between pairs of methods are based on the ranks for each method over the sample 
of firm-periods. Because the Friedman test is conducted using the ranks of the absolute scaled forecast errors, it ignores 
information about both the sign and magnitude of the scaled forecast errors; however, relative to parametric tests, the 
Friedman test makes fewer assumptions about the distribution of the scaled forecast errors, and inferences about 
differences in volatility are identical to inferences about differences in Black-Scholes option values since the 
Black-Scholes formula is monotonic in volatility. For a discussion of the Friedman test, see Conover (1980). 

13 Given the well-documented association between accounting earnings and stock prices, we also tried picking comparable 
firms on the basis of the standard deviation of annual return on equity (net income divided by the book value of common 
equity) over the five years before the forecast date. However, this variable was missing for approximately 35 percent 
of our sample, and the results were no better than those for the much simpler methods reported below. 

14 Mining (1000-1499) and construction (1500-1999) are treated as distinct "one"-digit industries. 


MVE: The ten eligible firms most similar to the target firm in terms of MVE on the forecast date. 

D/TA: The ten eligible firms most similar to the target firm in terms of D/TA at the end of the fiscal 
year before the forecast date. 

INDUSTRY+MVE: The ten eligible firms with the same n-digit SIC code that are most similar 
to the target firm in terms of MVE, where n is chosen to be as large as possible such that at 
least 30 comparable firms have the same n-digit SIC code. 

INDUSTRY+D/TA: The ten eligible firms with the same n-digit SIC code that are most similar 
to the target firm in terms of D/TA, where n is chosen to be as large as possible such that at 
least 30 comparable firms have the same n-digit SIC code. 
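
The selection rules listed above can be sketched as follows. This is a simplified illustration rather than the study's implementation: the `Firm` record and function names are hypothetical, similarity is assumed to mean the smallest absolute difference, the target is assumed not to appear in the eligible set, and the special treatment of mining and construction as separate one-digit industries is omitted.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    name: str
    sic: str    # four-digit SIC code stored as a string, e.g. "3571"
    mve: float  # market value of equity on the forecast date


def industry_peers(target, eligible, min_peers=10):
    """Return (n, peers) for the largest n with at least min_peers SIC matches."""
    for n in (4, 3, 2, 1):
        peers = [f for f in eligible if f.sic[:n] == target.sic[:n]]
        if len(peers) >= min_peers:
            return n, peers
    return 0, list(eligible)  # n = 0: INDUSTRY coincides with MARKET


def nearest_by(target, peers, key, count=10):
    """The `count` peers closest to the target on `key` (absolute difference)."""
    return sorted(peers, key=lambda f: abs(key(f) - key(target)))[:count]


def industry_plus_mve(target, eligible):
    """INDUSTRY+MVE: industry pool of at least 30 firms, then ten nearest by MVE."""
    _, pool = industry_peers(target, eligible, min_peers=30)
    return nearest_by(target, pool, key=lambda f: f.mve)


# Hypothetical universe: 4 exact SIC matches, 7 three-digit matches, 20 others.
target = Firm("T", "3571", 100.0)
eligible = (
    [Firm(f"a{i}", "3571", 50.0 + i) for i in range(4)]
    + [Firm(f"b{i}", "3575", 90.0 + i) for i in range(7)]
    + [Firm(f"c{i}", "2000", 10.0 * i) for i in range(20)]
)
n_digits, peers = industry_peers(target, eligible)
matched = industry_plus_mve(target, eligible)
```

In this example only four firms share the 4-digit code, so the INDUSTRY rule falls back to the 3-digit level, while the INDUSTRY+MVE rule (which needs 30 firms) falls back to the whole universe before picking the ten closest by MVE.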


The first approach, MARKET, is a naive procedure that serves as a benchmark for the other 
methods. Industry membership is a widely used matching variable in accounting research, and it 
captures several potential determinants of volatility (Lev 1983): product type (durables vs. 
nondurables), competition (i.e., barriers-to-entry), and capital intensity. Finally, as discussed 
earlier, MVE (firm size) and D/TA (capital structure) have been shown to affect volatility 
(Christie 1982; Karolyi 1993). 

The universe of eligible comparable firms (the sample from which comparable firms are 
chosen) must satisfy the first three selection criteria for the target forecast sample (Compustat 
and CRSP coverage; fiscal year ending between December, 1966 and June, 1987; and MVE and 
D/TA available). Eligible firms also must have at least 48 monthly returns available during the 
60-month (five-year) period preceding the target firm-period's forecast date.¹⁶ Finally, the fiscal 
year of eligible firms must end at least six months, and not more than 17 months, before the 
forecast date. As explained earlier, a six-month forecast lag ensures accounting data are publicly 
available, while a 17-month cutoff leads to a one-year window from which to match comparable 
firms. Regardless of the actual fiscal year end, volatility for the comparable firms is measured over 
the same five-year estimation period preceding a target firm's forecast date. 

We define C(F) to be the volatility forecast for the set of comparable firms F, where F = 
MARKET, INDUSTRY, MVE,..., and we measure C(F) by the median five-year historical 
volatility of monthly returns for the set F. We use the median to form the volatility forecasts since 
this measure of central tendency is not sensitive to outliers. 
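
As a rough sketch of the timing and data screens and of C(F) itself, the code below checks only the fiscal-year window and the 48-return minimum (the database coverage and MVE and D/TA screens are left out), and represents dates as hypothetical (year, month) pairs.

```python
from statistics import median


def months_between(earlier, later):
    """Whole months from (year, month) `earlier` to (year, month) `later`."""
    return (later[0] - earlier[0]) * 12 + (later[1] - earlier[1])


def is_eligible(fiscal_year_end, forecast_date, n_monthly_returns):
    """Fiscal year ends 6 to 17 months before the forecast date, 48+ returns."""
    lag = months_between(fiscal_year_end, forecast_date)
    return 6 <= lag <= 17 and n_monthly_returns >= 48


def comparable_forecast(peer_volatilities):
    """C(F): median five-year historical monthly volatility over the set F."""
    return median(peer_volatilities)


# The single extreme peer (0.90) leaves the median forecast unchanged at 0.30.
c_f = comparable_forecast([0.20, 0.30, 0.90])
```

The last line shows why the median is convenient here: one extreme peer volatility does not move the forecast.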

Results for the six comparable-firms forecasts and the daily, weekly, and monthly five-year 
historical forecasts HIST_D_5, HIST_W_5, and HIST_M_5 are presented in table 4. The 
historical forecasts serve as a benchmark for the comparable-firms forecasts, and are computed 
using all available returns during the five-year period. Table 4 is divided into four panels 
according to the number of monthly returns available to compute historical volatility: 48-60, 
18-47, 3-17, and 0-2 (panels A-D, respectively). Forty-eight months corresponds to the minimum 
number of observations we require to compute five-year volatility, while 18 and three are arbitrary 
cutoffs. Partitioning the results in this way provides insight on the relative accuracy of the 
historical forecasts and the comparable-firms techniques as the number of returns used to 
compute historical volatility declines. 


15 If the process culminates with n = 0 for a particular observation, then INDUSTRY and MARKET are identical. 

16 In contrast to the sample of target firm-periods, we do not require that FUTURE be available for the eligible firms. As 
a result, the target firm-periods and the eligible firms are not subsets of one another. (A target without historical volatility 
cannot be a comparable, while a comparable without future volatility cannot be a target.) 
Table 4 reveals that as the number of months decreases, the accuracy of the comparable-firms 
forecasts relative to the historical forecasts increases, and the accuracy of HIST_M_5 relative to 
HIST_W_5 declines. When at least 48 months are available (panel A), HIST_W_5 and 
HIST_M_5 are the most accurate forecasts (the only forecasts with relative accuracy equal to 
one), and of these two forecasts, only HIST_M_5 is unbiased. Its median and 90th-percentile 
|ERROR| are 0.178 and 0.464, respectively, and its median and 90th-percentile absolute 
percentage errors in pricing a standard at-the-money option (not reported in table 4) are 9.3% and 
24.8%, respectively. The results in panel A are inconsistent with the opinion of some commentators 
that historical volatility is unrelated to future volatility.¹⁷ If this view were true, the historical 
forecasts would be no more accurate than C(MARKET), a naive (uninformed) forecast. 

As the number of historical observations declines, however, the relative superiority of 
HIST_M_5 gives way to HIST_W_5 and the comparable-firms forecast C(INDUSTRY+MVE). 
When there are 18-47 months available (panel B), HIST_D_5, HIST_W_5, C(INDUSTRY), and 
C(INDUSTRY+MVE) are all ranked first, but C(INDUSTRY+MVE) is the only unbiased 
forecast of the four. When there are only 3-17 observations available (panel C), HIST_W_5 and 
C(INDUSTRY+MVE) are statistically indistinguishable from one another (the relative accuracy 
is one for both forecasts), and both forecasts are biased. Finally, when fewer than three months 
of returns are available (panel D), we do not compute past volatility, and the most accurate forecast 
is C(INDUSTRY+MVE).¹⁸ The median and 90th-percentile |ERROR| in this case are 0.238 and 
0.497, and the corresponding absolute option pricing errors are 14.4% and 31.3%. These latter 
results provide direct evidence on the accuracy of the FASB suggestion of using comparable firms 
to predict volatility for firms lacking historical stock prices. 

The best forecasting method when historical data are not available (panel D) is 
C(INDUSTRY+MVE); however, this forecast is biased (at the 0.002 level), tending to underestimate 
FUTURE. The median ERROR and fraction positive for C(INDUSTRY+MVE) are 0.158 
and 0.700, and the median signed mispricing of a standard at-the-money option for this forecast 
is 9.7%. Therefore, the FASB suggestion (as we implement it) of using historical volatilities of 
comparable firms to predict volatility for firms without historical data underestimates future 
volatility, on average.¹⁹ 

Another feature of table 4 is the inverse relation between the number of returns available to 
compute historical volatility and forecast error. The median |ERROR| for the most accurate 
forecast method in each panel equals 0.177, 0.216, 0.224, and 0.238 as the number of months takes 
on the values 48-60, 18-47, 3-17, and 0-2, respectively. This result is due, in part, to a positive 
relation between number of months and firm size together with a negative relation between firm 
size and |ERROR|. The median MVE is $67.6 million for the firm-periods with the most return 
data (48-60 months) versus $25.3 million for the firm-periods with the least (0-2 months). 
Moreover, for the subset of firm-periods with at least 48 months available (panel A), the rank 


17 In a comment letter submitted to the FASB, The Wyatt Company (1994b) argues, "In most cases, historical volatility 
is irrelevant, or at best a poor predictor of future volatility." 

18 For the subset of observations in panel D with (an arbitrary) 18 or more daily returns available to compute HIST_D_5, 
C(INDUSTRY+MVE) is more accurate than HIST_D_5. 

19 The results in panel D of table 4 are significant to the controversy surrounding the ED on stock-based compensation. 
Some commentators cite difficulty estimating volatility as a reason for their opposition to the ED. Without an agreed-upon 
benchmark, our tests cannot address this argument; however, our results do indicate that if volatility estimates were 
required for firms without historical data, a comparable-firms forecast like that suggested by the FASB, while greater 
than zero, would still be too low in most cases (and would therefore underestimate compensation expense). 
TABLE 4 

Forecast accuracy of five-year historical volatility HIST_D_5, HIST_W_5, and 
HIST_M_5 and comparable-firms volatility C(F), by number of observations available 
to compute historical volatility (13,851 observations) 

Forecast            Median   Fraction   Median    90%       Relative 
Method              ERROR    Positive   |ERROR|   |ERROR|   Accuracy 

Panel A: 48-60 months (8,815 Observations) 

HIST_D_5            -0.030   0.459*     0.194     0.511     2 
HIST_W_5            -0.047   0.430*     0.177     0.472     1 
HIST_M_5            -0.009   0.486      0.178     0.464     1 
C(MARKET)           -0.045   0.461*     0.270     0.727     4 
C(INDUSTRY)         -0.015   0.482*     0.227     0.577     3 
C(MVE)              -0.013   0.484      0.226     0.652     3 
C(D/TA)             -0.034   0.468*     0.269     0.743     4 
C(INDUSTRY+MVE)     -0.002   0.498      0.201     0.530     2 
C(INDUSTRY+D/TA)    -0.002   0.497      0.229     0.589     3 

Panel B: 18-47 months (1,137 Observations) 

HIST_D_5            -0.092   0.383*     0.216     0.572     1 
HIST_W_5            -0.132   0.331*     0.228     0.603     1,2 
HIST_M_5            -0.143   0.337*     0.244     0.671     2,3 
C(MARKET)            0.090   0.592*     0.278     0.632     4 
C(INDUSTRY)          0.044   0.551*     0.234     0.579     1,2 
C(MVE)              -0.066   0.440*     0.256     0.786     3,4 
C(D/TA)              0.090   0.582*     0.274     0.637     4 
C(INDUSTRY+MVE)     -0.028   0.470      0.228     0.628     1 
C(INDUSTRY+D/TA)     0.061   0.573*     0.237     0.588     1,2,3 

Panel C: 3-17 months (2,918 Observations) 

HIST_D_5             0.081   0.608*     0.226     0.565     2,3 
HIST_W_5             0.040   0.540*     0.224     0.574     1,2 
HIST_M_5             0.107   0.602*     0.284     0.659     4 
C(MARKET)            0.277   0.804*     0.319     0.575     5 
C(INDUSTRY)          0.184   0.721*     0.256     0.543     3 
C(MVE)               0.167   0.703*     0.258     0.542     3 
C(D/TA)              0.257   0.766*     0.319     0.603     5 
C(INDUSTRY+MVE)      0.126   0.671*     0.226     0.523     1 
C(INDUSTRY+D/TA)     0.208   0.736*     0.282     0.567     4 

Panel D: 0-2 months (981 Observations) 

HIST_D_5             -       -          -         -         - 
HIST_W_5             -       -          -         -         - 
HIST_M_5             -       -          -         -         - 
C(MARKET)            0.308   0.857*     0.335     0.547     3 
C(INDUSTRY)          0.199   0.769*     0.261     0.521     2 
C(MVE)               0.204   0.743*     0.266     0.520     2 
C(D/TA)              0.271   0.805*     0.319     0.590     3 
C(INDUSTRY+MVE)      0.158   0.700*     0.238     0.497     1 
C(INDUSTRY+D/TA)     0.205   0.760*     0.265     0.542     2 


* Significantly different than 0.500 at the 0.002 level using a two-tailed binomial test. 
ERROR is the scaled volatility forecast error = (FUTURE - FORECAST) / FUTURE, where FUTURE is volatility over 
the five-year forecast period and FORECAST is either HIST_f_5, f = D (daily), W (weekly), or M (monthly), or C(F), 
the median historical volatility for the set of comparable firms F (defined below). 

|ERROR| is the absolute value of ERROR. 

Relative Accuracy is an ordinal rank of the forecast methods by |ERROR| using the nonparametric Friedman Test, where 
forecast methods with different scores are statistically different from one another at the 0.002 level, and where a score 
of 1 indicates the most accurate forecast method. 

HIST_f_5 is stock return volatility for frequency f, f = D (daily), W (weekly), or M (monthly), computed using all returns 
available over the five years (six months, 26 weeks, or 125 days minimum) before the forecast date, expressed on an 
annual basis. 

C(F) is the median monthly volatility over the five years before the forecast date for the set of comparable firms F, where 
F is selected from all firms on the merged annual Compustat file and the NYSE, ASE, or NASDAQ CRSP files with 
fiscal year ending 6-17 months before the forecast date, with 5-year monthly historical volatility HIST_M_5 available 
(48 months minimum), with market value of equity available on the forecast date, and with debt to total assets available 
at the end of the fiscal year. F takes on the values: 

MARKET: the universe of all eligible comparable firms. 

INDUSTRY: the subset of all eligible firms in the same n-digit industry, n = 4, 3, 2, 1, 0, where n is chosen to be as large 
as possible such that at least ten comparable firms are in the same n-digit industry. 

MVE: the ten eligible firms most similar in terms of market value of equity on the forecast date. 

D/TA: the ten eligible firms most similar in terms of the ratio of book value of debt to book value of total assets at the 
end of the most recent fiscal year before the forecast date. 

INDUSTRY+MVE: the ten eligible firms most similar in terms of market value of equity in the same n-digit industry, 
n= 4, 3, 2, 1, 0, where n is chosen to be as large as possible such that at least 30 comparable firms are in the 
same n-digit industry. 

INDUSTRY+D/TA: the ten eligible firms most similar in terms of the ratio of book value of debt to book value of total 
assets in the same n-digit industry, n = 4, 3, 2, 1, 0, where n is chosen to be as large as possible such that at least 
30 comparable firms are in the same n-digit industry. 


correlations between MVE and |ERROR| for HIST_D_5, HIST_W_5, and HIST_M_5 are 
-0.146, -0.095, and -0.053, which are each significant at the 0.0001 level. 

To provide a better appreciation for the relation between firm size and forecast accuracy, 
table 5 presents results for HIST_D_5, HIST_W_5, and HIST_M_5 separately for each MVE 
quintile (formed annually). To conserve space, only results for firm-periods with 48-60 
months are presented. The analysis reveals that accuracy increases as firm size increases, and that 
the increase is most dramatic for daily volatility. The median |ERROR| (median absolute option 
pricing error, not reported) for HIST_D_5 declines from 0.237 (13.3%) for the first quintile (the 
smallest firms) to 0.167 (7.7%) for the fifth quintile (the largest firms). The corresponding values 
for HIST_M_5 are 0.189 (10.8%) and 0.167 (7.8%). 


Shrinkage Forecast 


Our final research question involves the accuracy of a shrinkage forecast formed by 
combining a historical forecast with a comparable-firms forecast. The FASB suggests this 
approach in the ED on accounting for stock-based compensation (FASB 1993, paragraph 195): 


...an entity whose common stock has been publicly traded for only a few years and has 
generally become less volatile as more trading experience has been gained...might 
consider the stock price volatilities of similar entities. 


Formal justification for this technique is provided by the statistics literature on shrinkage 
estimators, which adjust the mean of a subsample towards the mean of the overall sample (Stein 
TABLE 5 

Forecast accuracy of historical five-year volatility HIST_D_5, HIST_W_5, and 
HIST_M_5 for all observations with at least 80 percent of the returns available to 
compute volatility (1,010 days, 209 weeks, and 48 months), by market value of equity 
quintile (8,815 observations) 

MVE        Median    Forecast   Median   Fraction   Median    90% 
Quintile   MVE       Method     ERROR    Positive   |ERROR|   |ERROR| 

1             5.9    HIST_D_5   -0.132   0.357*     0.237     0.675 
                     HIST_W_5   -0.096   0.372*     0.199     0.544 
                     HIST_M_5   -0.040   0.446*     0.189     0.504 
2            23.3    HIST_D_5   -0.088   0.391*     0.214     0.596 
                     HIST_W_5   -0.087   0.377*     0.194     0.496 
                     HIST_M_5   -0.049   0.432*     0.187     0.497 
3            67.6    HIST_D_5   -0.020   0.472      0.183     0.469 
                     HIST_W_5   -0.040   0.443*     0.171     0.462 
                     HIST_M_5   -0.016   0.470      0.177     0.465 
4           214.6    HIST_D_5    0.014   0.522      0.183     0.434 
                     HIST_W_5   -0.022   0.462*     0.164     0.436 
                     HIST_M_5    0.006   0.510      0.170     0.445 
5          1001.7    HIST_D_5    0.035   0.555*     0.167     0.402 
                     HIST_W_5   -0.003   0.495      0.160     0.407 
                     HIST_M_5    0.046   0.572*     0.167     0.424 


* Significantly different than 0.500 at the 0.002 level using a two-tailed binomial test. 

MVE quintiles are determined separately for each calendar year for all observations with at least 80 percent of the returns 
available to compute historical volatility. 

MVE is the market value of equity on the forecast date, in millions of dollars. 

ERROR is the scaled volatility forecast error = (FUTURE - HIST_f_5) / FUTURE, where FUTURE is volatility of 
monthly returns over the five-year forecast period and HIST_f_5 is volatility of frequency f returns over the five years 
before the forecast date, f = D (daily), W (weekly), or M (monthly). 

|ERROR| is the absolute value of ERROR. 


1955; James and Stein 1961; Efron and Morris 1976). Shrinkage estimators are related to 
Bayesian procedures, which adjust the subsample means toward the mean of a prior distribution; 
as the number of subsamples increases, a shrinkage estimator converges to a Bayesian estimator. 
Geske and Roll (1984) and Karolyi (1993), respectively, develop and test a shrinkage estimator 
and Bayesian estimator of volatility for exchange-traded options; however, neither study 
considers long-term volatility for pricing convertible securities and employee stock options. In 
contrast, The Crystal Report (1994) presents a shrinkage estimator of long-term volatility, but it 
does not provide evidence on the accuracy of the estimator. 

Indirect empirical support for a shrinkage estimator of long-term volatility is provided by two 
correlations in our data. The first is the Spearman rank correlation between HIST_M_5 and its 
forecast error, (FUTURE - HIST_M_5) / FUTURE, and the second is the rank correlation 
between HIST_M_5 and the forecast error of the comparable-firms forecast C(MARKET). For 
the sample of firm-periods with at least 48 months of data, the first correlation is -0.411, which 
indicates there is mean reversion in long-term volatility, a result consistent with The Crystal 
Report (1994): When historical volatility is large (small), future volatility tends to be smaller 
(larger) than historical volatility. The second correlation is 0.661, which indicates the mean 
reversion is not complete: When historical volatility is large (small), future volatility remains 
large (small) relative to the median (historical) volatility for all firms. These results indicate that 
historical and comparable-firms forecasts tend to err in opposite directions, and therefore a 
composite forecast could outperform each individual forecast. 
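
The sign pattern of the two rank correlations can be illustrated with a small self-contained sketch. The data are hypothetical and deliberately constructed to show partial mean reversion; the function names are illustrative and the 0.34 market-wide median is an assumed value.

```python
def ranks(values):
    """Average ranks (1 = smallest); tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical firms whose future volatility moves only part of the way
# from historical volatility toward an assumed market-wide median of 0.34.
hist = [0.20, 0.30, 0.40, 0.50]
future = [0.30, 0.32, 0.36, 0.38]
market = 0.34

err_hist = [(f - h) / f for f, h in zip(future, hist)]
err_market = [(f - market) / f for f in future]

rho_hist = spearman(hist, err_hist)      # negative: mean reversion
rho_market = spearman(hist, err_market)  # positive: reversion is incomplete
```

With these constructed data the first correlation is negative and the second positive, the same signs as reported in the text, which is the pattern that motivates averaging the two forecasts.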

We let S(F) denote the shrinkage forecast for the set of comparable firms F, where F = 
MARKET, INDUSTRY, MVE,... (as before), and we measure S(F) as the average of the 
historical forecast HIST_W_5 and the comparable-firms forecast C(F): S(F) = 1/2 x HIST_W_5 
+ 1/2 x C(F). We use HIST_W_5 to compute the shrinkage forecast since it performs well in all 
four panels of table 4 (determined by the number of past months available). The results, together 
with those for the three historical forecasts (which serve as benchmarks), appear in table 6. Like 
table 4, table 6 is partitioned by the number of months available to compute historical volatility: 
48-60, 18-47, and 3-17 (panels A-C, respectively). Results for 0-2 months are not presented 
since we do not compute HIST_W_5 (and hence S(F)) for this partition. 
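
A minimal sketch of the shrinkage forecast follows. The equal weighting is the one defined in the text; the `weight_hist` parameter is an added generalization for illustration, and the input volatilities are hypothetical.

```python
def shrinkage_forecast(hist, comparable, weight_hist=0.5):
    """S(F) = w * HIST_W_5 + (1 - w) * C(F); w = 1/2 gives the simple average."""
    if not 0.0 <= weight_hist <= 1.0:
        raise ValueError("weight_hist must lie between 0 and 1")
    return weight_hist * hist + (1.0 - weight_hist) * comparable


# Hypothetical inputs: weekly historical volatility 0.40, peer median 0.30.
s_equal = shrinkage_forecast(0.40, 0.30)
s_peer_heavy = shrinkage_forecast(0.40, 0.30, weight_hist=0.25)
```

Lowering `weight_hist` pulls the forecast toward the comparable-firms median, which may be preferable when few historical returns are available.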

The evidence in table 6 reveals that a shrinkage forecast is more accurate than a historical 
forecast. For all three subsamples, S(INDUSTRY+MVE) is ranked first, while the rank of the best 
historical forecast in each panel varies between two and five. The median |ERROR| values for 
HIST_W_5 and S(INDUSTRY+MVE), for instance, are 0.177 and 0.165 in panel A, 0.228 and 
0.195 in panel B, and 0.224 and 0.181 in panel C. Like previous research on predicting short-term 
volatility for exchange-traded stock options (e.g., Geske and Roll 1984; Karolyi 1993), the results 
in table 6 provide strong support for using volatilities of comparable firms to help predict future 
long-term volatility, not only for target firms with little historical data available, but for firms with 
extensive historical data as well. 

As an additional check on our results, we tested two alternative shrinkage forecasts 
determined by changing the relative weights on the historical and comparable-firms forecasts 
from (1/2, 1/2) to (1/4, 3/4) and (3/4, 1/4). Specifically, we examined S'(F) = 1/4 x HIST_W_5 + 3/4 x C(F) 
and S''(F) = 3/4 x HIST_W_5 + 1/4 x C(F). Although we conducted no formal statistical tests for 
differences in accuracy, the results (not reported) indicate S'(F) is only slightly (but consistently) 
less accurate than S(F) for all three subsamples (corresponding to panels A-C in table 6). For 
instance, even when only 3-17 months of past data are available to compute HIST_W_5, the 
median |ERROR| for S'(INDUSTRY+MVE) is 0.188 versus 0.181 for S(INDUSTRY+MVE) in 
panel C of table 6. Similarly, S''(F) is slightly less accurate than S(F) when fewer than 48 months 
of past data are available but just as accurate when at least 48 months of past data exist. For 
instance, when there are 3-17 months of past returns, the median |ERROR| for 
S''(INDUSTRY+MVE) is 0.194, but when the number of past returns is 48 or more, the median 
|ERROR| of S''(INDUSTRY+MVE) and S(INDUSTRY+MVE) are both 0.165. The results 
suggest the weight on historical volatility should increase as the amount of past data increases, 
but that accuracy is not highly sensitive to alternative weighting schemes. 


IV. MATERIALITY OF VOLATILITY FORECAST ERRORS 


In this section, we examine the effect on net income of errors in pricing employee stock 
options due to errors in forecasting volatility. Our analysis attempts to address whether employee 
stock options can be valued with sufficient precision to support recognition for financial reporting 
purposes; i.e., whether errors in pricing employee stock options are material. The effect on net 
income of errors in forecasting volatility depends on two factors: the magnitude of the option 
valuation error and the (estimated) value of employee stock options relative to net income. Taking 
the other parameters in the option pricing model as given, the magnitude of the option valuation 
error depends solely on the magnitude of the error in forecasting volatility. 
TABLE 6 

Forecast accuracy of five-year historical volatility HIST_D_5, HIST_W_5, and 
HIST_M_5 and shrinkage volatility S(F) for all observations with returns available to 
compute historical volatility (at least 42 days, nine weeks, or two months), by number of 
observations available to compute historical volatility (12,148 observations) 

Forecast            Median   Fraction   Median    90%       Relative 
Method              ERROR    Positive   |ERROR|   |ERROR|   Accuracy 

Panel A: 48-60 months (8,815 Observations) 

HIST_D_5            -0.030   0.459*     0.194     0.511     5 
HIST_W_5            -0.047   0.430*     0.177     0.472     2,3 
HIST_M_5            -0.009   0.486      0.178     0.464     3,4 
S(MARKET)           -0.063   0.406*     0.183     0.508     3,4 
S(INDUSTRY)         -0.044   0.436*     0.173     0.456     2 
S(MVE)              -0.047   0.425*     0.174     0.480     2 
S(D/TA)             -0.057   0.419*     0.184     0.524     4 
S(INDUSTRY+MVE)     -0.031   0.449*     0.165     0.437     1 
S(INDUSTRY+D/TA)    -0.034   0.445*     0.174     0.453     2 

Panel B: 18-47 months (1,137 Observations) 

HIST_D_5            -0.092   0.383*     0.216     0.572     3 
HIST_W_5            -0.132   0.331*     0.228     0.603     3,4 
HIST_M_5            -0.143   0.337*     0.244     0.671     4 
S(MARKET)           -0.041   0.434*     0.182     0.535     1,2 
S(INDUSTRY)         -0.062   0.417*     0.190     0.509     1 
S(MVE)              -0.110   0.331*     0.197     0.609     2 
S(D/TA)             -0.036   0.441*     0.191     0.530     1,2 
S(INDUSTRY+MVE)     -0.104   0.364*     0.195     0.529     1,2 
S(INDUSTRY+D/TA)    -0.043   0.432*     0.185     0.515     1 

Panel C: 3-17 months (2,918 Observations) 

HIST_D_5             0.081   0.608*     0.226     0.565     6 
HIST_W_5             0.040   0.549*     0.224     0.574     5,6 
HIST_M_5             0.107   0.602*     0.284     0.659     7 
S(MARKET)            0.128   0.688*     0.203     0.465     4 
S(INDUSTRY)          0.091   0.637*     0.187     0.442     2,3 
S(MVE)               0.081   0.626*     0.187     0.455     2 
S(D/TA)              0.123   0.680*     0.205     0.474     4,5 
S(INDUSTRY+MVE)      0.063   0.599*     0.181     0.439     1 
S(INDUSTRY+D/TA)     0.099   0.651*     0.191     0.449     3 


* Significantly different than 0.500 at the 0.002 level using a two-tailed binomial test. 


ERROR is the scaled volatility forecast error = (FUTURE - FORECAST) / FUTURE, where FUTURE is volatility of 
monthly returns over the five-year forecast period and FORECAST is either HIST_D_5, HIST_W_5, HIST_M_5, or 
S(F), the shrinkage volatility forecast, the average of HIST_W_5 and C(F), the median historical volatility for the set 
of comparable firms F (defined in the notes to table 3): 
S(F) = 1/2 x HIST_W_5 + 1/2 x C(F) 

|ERROR| is the absolute value of ERROR. 

Relative Accuracy is an ordinal rank of the forecast methods by |ERROR| using the nonparametric Friedman Test, where 
forecast methods with different scores are statistically different from one another at the 0.002 level, and where a score 
of 1 indicates the most accurate forecast method. 
More formally, let NI_ERR denote the percentage error in pro forma net income, NI_pf (net 
income after subtracting the value of employee stock options). Then 

$$ NI\_ERR = -\,OP\_ERR \times \frac{(1-\tau)\,OP\_VAL}{NI_{pf}} \tag{1} $$

where 
OP_ERR = the signed (pre-tax) error in pricing long-term options, expressed as a percentage 
         of predicted option values;²⁰ 
τ      = the combined state and federal income tax rate (expressed as a fraction); and 
OP_VAL = the predicted option value (computed using predicted volatility) recognized as 
         an expense in computing NI_pf. 

To simplify equation (1), we define R to be the percentage effect of subtracting the value of 
employee stock options from net income under current accounting standards (without an expense 
for employee stock options issued at-the-money), NI_cur, expressed as a positive fraction: R = 
(1 - τ) OP_VAL / NI_cur.²¹ Then NI_pf = (1 - R) NI_cur and equation (1) becomes 

$$ NI\_ERR = -\,OP\_ERR \times \frac{R}{1-R} \tag{2} $$


Our research indicates that, depending on firm size and the amount of historical data 
available, a typical (median) absolute error in predicting volatility is approximately 16 to 24 
percent, and an extreme (90th-percentile) absolute error is about 40 to 51 percent. A typical error 
in pricing a standard at-the-money option as a fraction of the predicted option value, OP_ERR, 
is somewhat lower, say seven to 15 percent, while an extreme error is more like 20 to 25 percent. 
Information on the distribution of R, the second factor affecting the impact of errors in forecasting 
volatility on net income, is not available for a broad cross-section of firms; however, some 
information is provided in the report on the results of the FASB field test of the ED on stock-based 
compensation (FASB 1994a, table 2). For the 25 companies participating in the field test, the 
minimum, median, and maximum percentage effect on net income or net loss in 1992 (after full 
phase-in of the ED) are 0.53, 3.58, and 70.99%, respectively.²² The corresponding values of 
|NI_ERR| from equation (2) assuming a typical absolute option pricing error of 15 percent are 0.1, 
0.6, and 36.7%. The numbers for an extreme absolute option pricing error of 25 percent are 0.1, 
0.9, and 61.2%. Although an absolute error of 61.2% would likely be material (depending on the 
magnitude of net income), a 70.99% reduction in net income is unusual; the second-largest 
reduction in net income for the 25 firms in the field test is 33.06%. In contrast, the error in net 
income for the median firm in the field test, even assuming an extreme (90th-percentile) absolute 
error in compensation expense, is 0.9%, which generally would not be considered material.²³ 
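
The figures in this paragraph follow directly from equation (2), and a short sketch makes the arithmetic explicit. The function names are illustrative; the R values are the field-test minimum, median, and maximum quoted above.

```python
def ni_err(op_err, r):
    """Equation (2): NI_ERR = -OP_ERR * R / (1 - R)."""
    if not 0.0 <= r < 1.0:
        raise ValueError("R must satisfy 0 <= R < 1")
    return -op_err * r / (1.0 - r)


def op_err_for_threshold(threshold, r):
    """Absolute option pricing error at which |NI_ERR| equals `threshold`."""
    return threshold * (1.0 - r) / r


field_test_r = [0.0053, 0.0358, 0.7099]  # minimum, median, maximum R

# |NI_ERR| in percent, rounded to one decimal place.
typical = [round(100 * abs(ni_err(0.15, r)), 1) for r in field_test_r]
extreme = [round(100 * abs(ni_err(0.25, r)), 1) for r in field_test_r]

# Pricing error needed to reach a five percent effect when R is ten percent.
required = op_err_for_threshold(0.05, 0.10)
```

Because R / (1 - R) grows rapidly as R approaches one, the same option pricing error translates into a far larger income error for the firm with the maximum R than for the median firm.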


20 We scale the error by the option value computed using predicted rather than actual (i.c., realized) volatility since we wish 
to examine the effect of option valuation errors on pro forma net income, which is computed using predicted volatility. 

21 We use R in our model in keeping with the report on the results of the field test of the ED on stock-based compensation
(FASB 1994a, table 2). 

22 Additional information on the distribution of the effect of expensing employee stock options is provided in the
ShareData field study of the ED on stock-based compensation (ShareData 1994, 9). The 30 companies in the ShareData
sample are smaller on average than the 25 companies in the FASB sample, and for this reason some commentators prefer
the ShareData field study. For this sample, the minimum, median and maximum percentage effect on net income or net
loss in 1992 are 0.67, 9.70 and 102.02%, respectively.

23 Evidence in Boatsman and Robertson (1974) suggests five percent is a reasonable materiality threshold. The report on
the field test suggests R is less than ten percent for most (84 percent) of the firms. Under this pair of assumptions, the
absolute option pricing error |OP_ERR| would have to be 45 percent to have a material effect on net income; however,
45 percent is large relative to the errors encountered in our analysis.

V. SUMMARY


In this paper, we investigate empirically the prediction of five-year volatility of monthly 
stock returns for purposes of pricing long-term equity derivatives. We examine three general 
approaches: a historical forecast (a firm’s past volatility), a comparable-firms forecast (the 
median historical volatility for a set of comparable firms), and a shrinkage forecast (a historical 
forecast adjusted towards a comparable-firms forecast). Our analysis provides guidance to firms 
selecting an optimal capital structure (e.g., by helping to structure and value convertible 
securities) or developing efficient compensation plans (e.g., by helping to design and price 
employee stock options). Also, by documenting the accuracy of several volatility estimation 
techniques and by assessing the materiality of volatility forecast errors, our analysis provides 
guidance to policy makers working to develop financial reporting standards for equity deriva- 
tives, to accountants and managers responsible for conforming to the standards, and to investors 
and accounting researchers interested in evaluating the impact of the standards. 

Our major results include the following: When using a historical forecast to predict five-year 
monthly volatility, historical volatility should be computed using either weekly or monthly 
returns, and the estimation period should be approximately five years. If a historical forecast 
exists, it generally performs better than a comparable-firms forecast (at least those we examine). 
If data to compute a historical forecast do not exist, then picking comparable firms on the basis 
of industry and firm size works best. Finally, when there are some historical data available to 
compute past volatility, a shrinkage forecast is more accurate than either a historical or 
comparable-firms forecast. Taken together, our results suggest that if the value of employee stock 
options were recognized as an expense, errors in pricing employee stock options due to errors in 
predicting long-term volatility would rarely have a material effect on net income.

ABSTRACT: A general prediction from the economic theory of tax reporting is that
taxpayers will report more income as the tax rate increases, but the related empirical
evidence has been mixed. We conducted an experiment to examine whether
taxpayers' responses to a tax-rate change depend on both economic effects and
perceptions of horizontal and exchange inequity. Our findings reconcile the
previously inconsistent empirical results by identifying conditions under which
perceptions of inequity drive taxpayers’ reporting decisions. In summary, subjects reported
less (more) income as tax rates increased (decreased) when they were inequitably
treated relative to others, but not when they were equitably treated relative to others.


Key Words: Tax rates, Horizontal equity, Exchange equity, Tax evasion, Tax 
compliance.

I. INTRODUCTION


A general prediction from the economic theory of income tax reporting is that taxpayers
have incentives to report more (less) income as tax rates increase (decrease). Most field 
studies (e.g., Clotfelter 1983; Poterba 1987) and experimental studies (e.g., Baldry 1987;
Benjamini and Maital 1985; Collins and Plumlee 1991) have found the opposite result. On the
other hand, Beck et al. (1991) found support for the prediction of the economic model. We argue 
that the economic model’s prediction is generally not confirmed because, in addition to economic 
incentives, perceptions of both horizontal and exchange inequity influence taxpayers’ reporting 
behavior (see Jackson and Milliron, 1986 for a discussion of other nonmonetary factors affecting 
tax reporting). Horizontal inequity arises when taxpayers perceive that they are treated inequita- 
bly relative to other taxpayers with the same income, expenses, etc. Exchange inequity arises 
when taxpayers perceive that their exchange with the government is inequitable, i.e., when they
believe that the taxes they pay exceed the benefits of the public goods and services they receive. 
Equity theory (Adams 1965; Walster et al. 1978; Walster et al. 1973) predicts that individuals who 
perceive such inequities will act to rectify them. In the tax literature, the usual interpretation of 
equity theory is that taxpayers will report less income as such inequities increase (Lewis 1982). 

We conduct an experiment which manipulates both horizontal inequity (by varying whether 
subjects’ tax rates are higher than or equal to those of other taxpayers) and exchange inequity (by 
varying tax rates while holding public goods and services constant). We find that, in the presence 
of horizontal inequity, subjects respond to an increase in exchange inequity (resulting from a tax- 
rate increase) by reporting less income. In the presence of horizontal inequity, the effect of the 
increased exchange inequity appears to dominate the effects of the economic incentives 
associated with a tax-rate increase. This result is consistent with the usual empirical finding, but 
not with the prediction from the economic model of tax reporting. In contrast, in the presence of 
horizontal equity, subjects do not significantly change the amount of income they report as the 
tax rate increases. Subjects react less to the increase in exchange inequity associated with a tax- 
rate increase, apparently because they realize that all other taxpayers face the same tax-rate 
increase. Thus, in the presence of horizontal equity, the effect of the increased exchange inequity 
no longer dominates the effect of the economic incentives associated with a tax-rate increase. 

These results reconcile previously inconsistent empirical findings. Beck et al. (1991) used an 
abstract experimental setting in which subjects were, by design, very unlikely to perceive either 
horizontal or exchange inequity. In the absence of such perceptions, only the economic incentives 
influenced subjects’ reporting behavior. In contrast, all other previous empirical studies were set 
in a tax context in which perceptions of horizontal and exchange inequity were likely to be a part 
of taxpayers’ mental scripts. Evidence for this contention comes from a wide range of survey 
research which suggests that such perceptions of inequity influence taxpayers’ behavior in actual 
tax environments. While the economic model’s predictions may be supported in abstract 
experimental settings, some additional factor such as equity may be required to explain taxpayer 
behavior in more realistic tax settings. 

Equity considerations are also potentially important in other accounting settings. For 
example, the cost to corporate headquarters (in terms of distortions of otherwise optimal risk- 
sharing arrangements) of obtaining truthful reports from unit managers may depend on whether 
the managers perceive that corporate headquarters is using the information to treat them equitably 
relative to other managers. While our experiment provides no direct evidence about such multi- 
unit organizational environments, the similarities between such settings and the tax-reporting 
environment suggest that consideration of equity issues in corporate settings may be warranted. 
This view is consistent with Milgrom and Roberts’ (1992, 418) suggestion that equity consider- 
ations are likely to play an important role in various organizational contexts. 

Section II of this paper describes the predictions of the economic theory of tax reporting with 
respect to tax-rate changes. In section III, we develop and specify our hypothesis. Section IV 
describes our experiment and section V reports our results. We discuss the implications of our 
findings in section VI.

II. ECONOMIC THEORY OF TAX REPORTING


The traditional economic approach to modeling tax reporting takes the IRS audit policy as 
fixed and treats the taxpayer as an expected utility maximizer who chooses how much income to 
report by trading off the expected cost of penalties versus the expected benefit of paying less tax. 
The basic model, developed by Allingham and Sandmo (1972) and Yitzhaki (1974), demonstrates 
that with decreasing or constant absolute risk aversion, an increase (decrease) in the tax rate will 
produce an increase (decrease) in reported income. This prediction reflects a riskiness effect and
a wealth effect. The riskiness effect arises because the riskiness of the gamble associated with
underreporting increases as the tax rate increases.¹ That is, for any initial level of reported income,
the spread between the taxpayer's after-tax wealth if audited versus if not audited increases with 
the tax rate. Therefore, the riskiness effect predicts that any risk averse taxpayer will report more 
income as the tax rate increases. 

An increase in the tax rate also reduces the taxpayer’s wealth. Therefore, with decreasing 
(constant) absolute risk aversion, the wealth effect predicts a further increase (no change) in the 
taxpayer’s reported income. Thus, the combination of the riskiness and wealth effects predicts 
that an increase (decrease) in the tax rate produces an increase (decrease) in reported income. 
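
The direction of this prediction can be illustrated numerically. The following is a minimal sketch, not the Allingham and Sandmo (1972) or Yitzhaki (1974) derivation itself: it assumes a penalty proportional to the evaded tax, borrows illustrative parameter values from the experiment described in section IV (income, endowment, audit probability, penalty rate), and uses two hypothetical utility functions, a logarithmic one for decreasing absolute risk aversion and an exponential one (with an arbitrary risk-aversion coefficient) for constant absolute risk aversion.

```python
import math

INCOME = 10_000       # true income (assumed; borrowed from the experiment in section IV)
ENDOWMENT = 2_000     # assumed base wealth; keeps audited wealth positive
AUDIT_PROB = 0.25     # assumed audit probability
PENALTY_RATE = 1.5    # assumed penalty as a multiple of the evaded tax


def expected_utility(reported, tax_rate, utility):
    """Expected utility of reporting `reported` out of INCOME at `tax_rate`."""
    evaded_tax = tax_rate * (INCOME - reported)
    wealth_not_audited = ENDOWMENT + INCOME - tax_rate * reported
    # If audited, the taxpayer pays the evaded tax plus a penalty proportional to it.
    wealth_audited = wealth_not_audited - (1 + PENALTY_RATE) * evaded_tax
    return ((1 - AUDIT_PROB) * utility(wealth_not_audited)
            + AUDIT_PROB * utility(wealth_audited))


def optimal_report(tax_rate, utility):
    """Grid search over whole-unit reports for the expected-utility maximizer."""
    return max(range(INCOME + 1),
               key=lambda reported: expected_utility(reported, tax_rate, utility))


def log_utility(wealth):
    """Decreasing absolute risk aversion (illustrative choice)."""
    return math.log(wealth)


def cara_utility(wealth, a=1e-4):
    """Constant absolute risk aversion; the coefficient `a` is an arbitrary assumption."""
    return -math.exp(-a * wealth)


if __name__ == "__main__":
    for name, u in (("DARA (log)", log_utility), ("CARA", cara_utility)):
        print(name, [optimal_report(t, u) for t in (0.20, 0.30, 0.45)])
```

With these assumed parameters, the utility-maximizing report does not fall as the tax rate rises from 20 to 30 to 45 percent under either utility function, which is the qualitative pattern the economic model predicts.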

A number of refinements to the basic model have been introduced, including: 1) treating the 
IRS as a strategic player (Graetz et al. 1986); 2) allowing for taxpayer uncertainty about the actual
tax liability (Alm 1988); 3) making the taxpayer’s hours of labor, and hence income, endogenous
(Pencavel 1979); and 4) incorporating public goods financed by tax revenues (Cowell and Gordon
1988).² The predicted effect of a tax-rate change on reported income is unchanged by these
refinements except under certain special conditions in the final refinement involving public 
goods. However, Cowell and Gordon (1988) themselves describe this analysis as unsatisfactory 
and suggest that further work in the area is required. Therefore, the general prediction from the 
existing economic models of tax reporting is that a tax-rate increase (decrease) will result in more 
(less) reported income. 


III. HYPOTHESIS DEVELOPMENT 


The results of virtually all field studies (e.g., Clotfelter 1983; Crane and Nourzad 1986; 
Poterba 1987; Witte and Woodbury 1985) and laboratory experiments (e.g., Baldry 1987; 
Benjamini and Maital 1985; Collins and Plumlee 1991; Friedland et al. 1978) indicate that, 
contrary to the predictions of the economic model of tax reporting, tax-rate increases (decreases) 
are associated with decreased (increased) reported income. Although some of these studies
allude to the potential importance of equity perceptions in taxpayers' decisions (e.g., Baldry 1987,
378; Friedland et al. 1978, 115), none of them was designed to test whether equity perceptions
affect taxpayers' reporting behavior.

A possible explanation for the results of the tax-rate studies described above is that the 
combined effects of perceptions of horizontal inequity (Dean et al., 1980) and exchange inequity 
(Spicer and Lundstedt 1976; Vogel 1974) caused taxpayers to respond to a tax-rate change in a 
manner opposite to that predicted by the economic model of tax reporting.³ If the level of public


1 The model assumes that the penalty for underreporting is determined by multiplying the penalty rate by the unpaid tax,
which is consistent with the United States tax system. Given this structure, the penalty increases proportionately with 
the increase in the tax rate. 

2 These refinements all assume expected utility maximization. To the degree that taxpayers’ preferences reflect nonmonetary 
factors such as equity considerations, moral beliefs, peer influence, etc. (Jackson and Milliron 1986; Webley et al. 1991),
none of the current economic models may adequately characterize taxpayers’ actual reporting behavior. 

3 Vertical equity is a third type of equity and refers to the relative treatment of taxpayers with different levels of income 
(Dean et al. 1980). Vertical equity is often described in terms of tax-rate progressivity, i.e., the relative treatment of more 

(Continued on page 622)

goods and services remains constant as the tax rate changes, the taxpayers’ exchange with the
government changes. Because subjects in the previous experimental tax-rate change studies 
received no public goods (i.e., the level of public goods was held constant at zero), exchange 
inequity effectively increased (decreased) as the tax rate increased (decreased). Thus, a tax-rate 
change would result in taxpayers reporting less (more) income when tax rates increased 
(decreased) if the effect of the change in exchange inequity dominated the effect of the economic 
incentives.⁴

We hypothesize that the magnitude of the effect of a change in exchange inequity on reported 
income depends on taxpayers’ perceptions of horizontal equity or inequity. Specifically, we 
expect that the effect of exchange inequity (resulting from a tax-rate increase) will dominate the 
effect of economic incentives when taxpayers also perceive that they are inequitably treated 
relative to other taxpayers. The survey literature suggests that such perceptions are common, as 
many taxpayers believe they pay too much in taxes relative to other taxpayers with the same level 
of income (e.g., Dean et al. 1980). On the other hand, we expect that the effect of exchange 
inequity will be reduced when taxpayers perceive that all taxpayers with equal incomes face the 
same tax increase. No previous study has examined how taxpayers respond to a tax-rate change 
under different horizontal equity conditions.⁵

Our focus on perceived inequity also suggests that Beck et al. (1991) may have found results 
consistent with the predictions of the economic model because their experimental setting differed 
from the settings used in other studies in an important way. Beck et al.’s experimental setting was
abstract: their subjects did not know they were operating in a tax setting and thus would not be
expected to retrieve their mental tax-payment scripts (Alm, McClelland, and Schulze 1992).⁶
Taxpayers' mental scripts typically include perceptions of both horizontal inequity (Dean et al.
1980) and exchange inequity (Spicer and Lundstedt 1976; Vogel 1974; Yankelovich et al. 1984).⁷
If Beck et al.'s subjects did not view their task as a tax-reporting decision, it is unlikely that either 
horizontal or exchange equity considerations entered into their decisions and only the economic 


(Footnote 3 continued) 
wealthy taxpayers versus less wealthy taxpayers (e.g., Lewis 1978; Roberts and Hite 1992; Roberts et al. 1992). This issue 
does not arise in our experiment because we hold income constant across all experimental conditions. 

4 An alternative way to vary exchange inequity is to vary the amount of public goods that subjects receive while holding 
the tax rate constant. Becker et al. (1987) explicitly manipulated exchange inequity in this manner. In addition, while 
not explicitly varying exchange inequity, two other studies have varied the level of public goods subjects received (Alm,
McClelland, and Schulze, 1992; Alm, Jackson, and McKee, 1992). Unfortunately, because of the experimental designs
used, the economic models in these studies yield ambiguous predictions concerning the effect on reported income of
changing the level of public goods, and hence the experimental results cannot distinguish between the economic model's
prediction and the exchange inequity prediction. However, consistent with perceptions of exchange inequity playing
a role in taxpayers’ reporting decisions, these public goods studies consistently find that taxpayers report more income
when they receive public goods than when they do not. 

5 There is limited and mixed experimental evidence regarding the effect of horizontal equity alone on tax reporting. Spicer 
and Becker (1980) found that taxpayers facing horizontal inequity (i.e., their tax rate was higher than other taxpayers’ tax 
rates) reported less income than did taxpayers facing horizontal equity (i.e., their tax rate was the same as other taxpayers). 
Webley et al. (1988) found no statistically significant relation between horizontal equity and reported income. 

6 Beck et al. (1991, fn 6) report that only 14 percent of their subjects believed that the experimental task dealt with tax 
compliance. Beck et al. chose an abstract experimental setting because subjects’ behavior might be sensitive to references
to “real world” phenomena (see also Davis and Swenson 1988, 49). Abstract settings are generally preferred for tests of
economic theories because such settings provide better control over subjects’ preferences. However, use of abstract
settings in tax experiments ignores evidence that context is important in explaining tax evasion behavior (Alm 1991, 589).

7 Regarding exchange inequity, Yankelovich et al. (1984) interviewed 2200 U.S. taxpayers and found that 73 percent of
taxpayers agreed with the statement that “My income taxes are too high for what I get from the federal government”
(139, table V-10). Regarding horizontal equity, Dean et al. (1980) surveyed 424 taxpayers in Scotland and report that
approximately 60 percent of taxpayers believe they pay “too much” relative to other taxpayers with the same level of
income (37, figure 1).

incentives associated with the tax-rate change were present to influence subjects’ reporting
behavior. 

The explanation offered above assumes that subjects’ behavior in the previous tax-context 
experiments was influenced by their mental tax-payment scripts which included perceptions of 
both horizontal and exchange inequity. In the present study, rather than rely on assumptions 
regarding the contents of subjects’ tax-payment scripts, we directly manipulate both horizontal 
and exchange inequity. Alm, McClelland, and Schulze (1992) have shown that when the relevant 
information is provided, subjects use it rather than relying on the content of their mental scripts.
We examine whether perceptions of horizontal and exchange inequity interact to influence 
taxpayers’ reporting behavior, and whether the joint effects of such equity perceptions can help 
to reconcile the previous inconsistent empirical results. Scott and Grasmick (1981) have 
demonstrated the importance of testing for such interactions between factors that affect tax 
reporting decisions by showing that ignoring such interactions can lead to inappropriate 
conclusions regarding the separate effects of individual factors on taxpayers’ reporting decisions. 

In summary, we hypothesize that taxpayers’ responses to a change in exchange inequity 
resulting from a tax-rate change will depend on whether they are treated inequitably relative to 
other taxpayers (horizontal inequity) or equitably relative to other taxpayers (horizontal equity). 
Thus, our hypothesis predicts an interaction between taxpayers’ responses to a change in 
exchange equity resulting from a tax-rate change and their perceptions of horizontal equity: 


Hypothesis: In the presence of horizontal inequity, taxpayers will report less (more) income 
when the tax rate increases (decreases) and the level of public goods remains constant. 
However, in the presence of horizontal equity, taxpayers will not report less (more) 
income when the tax rate increases (decreases) and the level of public goods remains 
constant. 


IV. EXPERIMENT 

Design 

To test our hypothesis, we designed an experiment in which both Horizontal Equity (Inequity 
vs. Equity) and Exchange Inequity (Increasing vs. Decreasing) were manipulated. The amount 
of public goods was held constant (at zero) to ensure that increasing (decreasing) tax rates resulted 
in increasing (decreasing) exchange inequity and to maintain consistency with the previous tax- 
rate change experiments. The experimental design is depicted in the top portion of figure 1. 

As shown in the top portion of figure 1, Horizontal Equity is a between-subjects factor with 
two levels: Inequity and Equity. Subjects in the Horizontal Inequity condition were informed at 
the start of each period that the tax rate they faced that period was higher than the tax rate faced 
by some other taxpayers.⁸ Subjects in the Horizontal Equity condition were informed that the tax
rate they faced was the same as the tax rate for all other taxpayers. 

Exchange Inequity is a between-subjects factor with two levels: Increasing and Decreasing. 
In the Increasing Exchange Inequity condition, the tax rate increased from 20 percent in periods 
1-3, to 30 percent in periods 4—6, and to 45 percent in periods 7—9, and, because public goods 


8 As shown in figure 1, when the subjects faced a 20 percent rate, they were told that some other taxpayers faced a seven
percent rate; when the subjects faced a 30 percent rate, they were told that some other taxpayers faced a ten percent rate;
and when the subjects faced a 45 percent rate, they were told that some other taxpayers faced a 15 percent rate. Thus,
as measured by the 3 to 1 ratio of the tax rates, the degree of horizontal inequity was held approximately constant across
the nine periods of the experiment.

FIGURE 1
Experimental Design

Horizontal    Exchange
Equity        Inequity
Condition     Condition      n    Periods 1-3    Periods 4-6    Periods 7-9

Horizontal    Increasing    23    20%*           30%            45%
Inequity                          (7%)**         (10%)          (15%)
              Decreasing    25    45%            30%            20%
                                  (15%)          (10%)          (7%)

Horizontal    Increasing    24    20%            30%            45%
Equity                            (20%)          (30%)          (45%)
              Decreasing    25    45%            30%            20%
                                  (45%)          (30%)          (20%)

Baseline      Constant      42    20%            20%            20%
Condition

*Indicates the tax rate faced by subjects in the corresponding cell. 
**Subjects were informed that some other taxpayers faced the tax rate in parentheses. 


remained constant, exchange inequity increased accordingly. In the Decreasing Exchange 
Inequity condition, this pattern was reversed and exchange inequity decreased accordingly. 
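
The tax-rate schedules in figure 1 can be restated compactly as data. The sketch below is only an illustrative encoding of the design (the names `OWN_RATES`, `others_rate`, and `schedule` are assumptions introduced here); it pairs each subject's own rate with the rate the subject was told other taxpayers faced.

```python
# Own tax rate (percent) in periods 1-3, 4-6, and 7-9 for each Exchange Inequity condition.
OWN_RATES = {"Increasing": (20, 30, 45), "Decreasing": (45, 30, 20)}

# Rate that subjects in the Horizontal Inequity condition were told some others faced.
OTHERS_RATE_UNDER_INEQUITY = {20: 7, 30: 10, 45: 15}


def others_rate(own_rate, horizontal_condition):
    """Comparison rate shown to a subject, given the Horizontal Equity condition."""
    if horizontal_condition == "Equity":
        return own_rate
    if horizontal_condition == "Inequity":
        return OTHERS_RATE_UNDER_INEQUITY[own_rate]
    raise ValueError(f"unknown condition: {horizontal_condition}")


def schedule(horizontal_condition, exchange_condition):
    """List of (own rate, others' rate) pairs for the three blocks of periods."""
    return [(own, others_rate(own, horizontal_condition))
            for own in OWN_RATES[exchange_condition]]


if __name__ == "__main__":
    for horizontal in ("Inequity", "Equity"):
        for exchange in ("Increasing", "Decreasing"):
            print(horizontal, exchange, schedule(horizontal, exchange))
```

In the Horizontal Inequity cells, the ratio of own rate to others' rate stays between roughly 2.9 and 3.0 in every block, so the degree of horizontal inequity is approximately constant while exchange inequity varies.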

Another factor called Periods is included in the design to operationalize the Increasing and 
Decreasing Exchange Inequity conditions. As such, it is a within-subject factor with three levels:
periods 1—3, periods 4—6, and periods 7—9. Within each set of three periods, the tax rate was held 
constant. In addition to the manipulated factors described above, other factors that influence 
reporting decisions, i.e., level of income, level of public goods, probability of audit and amount 
of penalty for underreporting, were controlled by holding them constant across all experimental 
cells. 

We also collected data for a baseline condition (depicted in the bottom portion of figure 1) 
in which the tax rate remained constant at 20 percent for all nine periods, so that there was no 
change in exchange inequity. To the extent that repeated experience with the task affects subjects’ 
responses across the experimental periods, the data from this baseline condition provide an 
estimate of this “experience effect.” This baseline estimate is then used to adjust the data in the 
experimental cells depicted in the top portion of figure 1, in order to remove the effect of 
experience from the experimental data. 


Procedures 


One hundred and thirty-nine MBA students each participated in one of four experimental 
sessions. Each session consisted of nine periods and lasted about 90 minutes. All sessions were
conducted on the same day and approximately one-fourth of the subjects participated in each
session. In each of the four sessions, an approximately equal number of subjects was randomly 
assigned to each of the four between-subjects experimental cells depicted in the top portion of 
figure 1 and to the Baseline condition cells depicted in the bottom of figure 1. Thus, all cells
depicted in figure 1 include observations from each of the four experimental sessions.

An experimental currency called Lira was used, and at the end of the experiment subjects 
were paid in cash at a rate of $1.00 (U.S.) per 1,000 Lira. The amount each subject was paid 
depended on which one of the nine periods was randomly selected for payment at the end of the 
experiment, the amount of income the subject reported in the selected period and whether that 
subject had been audited in the selected period (details provided below). Because each period had 
an equal probability of being selected for payment, the payment scheme should produce the same 
reporting incentive in each period. 

At the start of the experiment, the instructions were read aloud and a practice period was 
conducted. Each of the nine experimental periods began with subjects reading their information 
for that period. This information included the subject’s tax rate for that period as well as the tax 
rate of some other hypothetical taxpayers. Both of these tax rates are shown in each experimental 
cell in figure 1. Subjects did not know the tax rates of any other subjects in any other cells in 
figure 1. 

The instructions clearly specified that (1) all taxpayers had the same income of 10,000 Lira 
in every period, (2) the probability of audit was 25 percent in each period and did not change when 
a subject was previously audited and (3) the penalty rate was 150 percent of the tax evaded. This 
information was repeated as part of the written information each subject received at the start of 
each of the nine periods. At the beginning of each period, one of the experimenters reminded the 
subjects to read carefully all information provided to them for that period. After subjects read all 
their information for the period, they decided how much income to report. 

Any underreporting of income by a subject was detected only if that subject was selected for 
an audit at the end of the period. At the end of each period, subjects were selected for an audit using 
a random procedure in which ten chips were drawn from a container holding 40 chips (with each 
participant's number on one chip, plus enough other chips to total 40 chips). Thus, each subject 
faced a 25 percent probability of audit. The procedure was conducted in full view of all subjects. 
Because selection was based on participant number, and because each participant knew his or her 
own number but not the numbers of other participants, only subjects selected for an audit knew 
that they were selected and whether they had underreported.⁹ In each period, after the audit
procedure was completed, subjects calculated the amount of payment they would receive if, at the 
end of the experiment, that period were the one randomly selected to determine their cash 
payment. 
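
The audit draw can be mimicked in a few lines. This is a hypothetical simulation of the chip procedure, not the experimenters' actual materials: chips are numbered 1 to 40, ten are drawn without replacement, and a subject is audited if his or her number is among them.

```python
import random
from fractions import Fraction

TOTAL_CHIPS = 40   # chips in the container
CHIPS_DRAWN = 10   # chips drawn at the end of each period


def audit_probability():
    """Exact chance that any given chip is among those drawn."""
    return Fraction(CHIPS_DRAWN, TOTAL_CHIPS)


def draw_audited(rng):
    """Draw chips without replacement and return the set of audited chip numbers."""
    return set(rng.sample(range(1, TOTAL_CHIPS + 1), CHIPS_DRAWN))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so the illustration is repeatable
    print(audit_probability(), sorted(draw_audited(rng)))
```

Because the draw is without replacement, each chip has a 10/40 = 25 percent chance of selection in every period, regardless of whether its holder was audited earlier.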

Each subject was provided with an initial endowment of 2,000 Lira to ensure that no subject 
could have a negative balance at the end of the experiment. The amount a subject would be paid 
for any period (if, at the end of the experiment, that period were selected for payment) equaled 
the initial endowment of 2,000 Lira, plus income of 10,000 Lira, minus the tax and any penalty 
the subject would have to pay if the subject underreported and was audited in that period.¹⁰
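
The payment rule can be written as a small function. This sketch uses the parameters stated in the instructions (2,000 Lira endowment, 10,000 Lira income, a penalty of 150 percent of the tax evaded, and $1.00 per 1,000 Lira); the function names are illustrative assumptions.

```python
ENDOWMENT = 2_000         # initial endowment in Lira
INCOME = 10_000           # income per period in Lira
PENALTY_RATE = 1.5        # penalty as a multiple of the tax evaded
DOLLARS_PER_LIRA = 0.001  # $1.00 per 1,000 Lira


def payment_lira(reported, tax_rate, audited):
    """Payment in Lira for the period selected at the end of the experiment."""
    payoff = ENDOWMENT + INCOME - tax_rate * reported
    if audited:
        unpaid_tax = tax_rate * (INCOME - reported)
        # An audited subject pays the unpaid tax plus the penalty on it.
        payoff -= unpaid_tax + PENALTY_RATE * unpaid_tax
    return payoff


def payment_dollars(reported, tax_rate, audited):
    """Cash payment in U.S. dollars, rounded to the cent."""
    return round(payment_lira(reported, tax_rate, audited) * DOLLARS_PER_LIRA, 2)


if __name__ == "__main__":
    print(payment_dollars(8_000, 0.30, audited=False))
    print(payment_dollars(8_000, 0.30, audited=True))
```

For a subject reporting 8,000 Lira at a 30 percent rate, the function gives $9.60 if not audited and $8.10 if audited, which agrees with the illustration in footnote 10. Reporting nothing yields the extremes: $12.00 if not audited and $0.75 if audited at the 45 percent rate.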


9 In order to make the threat of an audit credible, it was necessary that underreporting be detected if and only if a subject
were selected for an audit. Consequently, subjects were informed that when they were paid at the end of the experiment,
the experimenters would be able to determine for certain which subjects had been audited in the payment period and
whether they had underreported. Subjects also were informed that if they were not selected for an audit, any underreporting
would go undetected and, as such, could not affect their payment.

10 To illustrate the payment scheme, assume that period 6 was randomly selected for payment. Consider a subject in the
Increasing Exchange Inequity condition and either the Horizontal Inequity or Horizontal Equity condition. This subject 

(Continued on page 626)

At the end of the experiment, subjects completed a post-experimental questionnaire and a risk
preference measurement instrument similar to that used by Murnighan et al. (1988). The post- 
experimental questionnaire contained manipulation checks for the Horizontal Equity and 
Exchange Inequity manipulations. Risk preferences were measured to test whether, as assumed 
in the economic model, our subject pool was risk averse and whether there were any significant 
differences in risk preferences across experimental conditions. 


V. RESULTS 


Before testing our hypothesis, we examined the baseline data for evidence of an effect of 
experience with the task. Table 1 reports Mean Reported Income amounts averaged across 
periods 1—3, periods 4—6 and periods 7—9, for each experimental cell and for the baseline 
condition. In the baseline condition, Mean Reported Income declined from periods 1—3 (8198) 
to periods 4—6 (7925), and again to periods 7—9 (7358) even though the tax rate remained constant 
at 20 percent across these periods. We use these amounts to estimate the "experience effect," and 
then to adjust the Mean Reported Income amounts in the top portion of table 1 accordingly.¹¹
Table 2 reports the resulting Adjusted Mean Reported Income amounts. Using these adjusted data 
to test our hypothesis simplifies the presentation of the results by eliminating the need to discuss 
the extraneous effect of experience. Further, the adjustment process does not change any of the 
statistical results of interest or our interpretation of these results (see footnote 16). 
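
The experience-effect estimate follows directly from the baseline means quoted above. In the sketch below, the effects are computed as the decline from periods 1-3; applying the adjustment by adding that decline back to the later blocks is an assumption made here for illustration, and the raw cell means in the usage example are hypothetical numbers, not data from table 1.

```python
# Baseline Mean Reported Income by block of periods (from the text).
BASELINE_MEANS = {"1-3": 8198, "4-6": 7925, "7-9": 7358}


def experience_effects(baseline=BASELINE_MEANS):
    """Decline in baseline reported income relative to periods 1-3."""
    reference = baseline["1-3"]
    return {block: reference - mean for block, mean in baseline.items()}


def adjust(raw_means, effects):
    """Assumed adjustment: add the estimated decline back to each block."""
    return {block: raw_means[block] + effects[block] for block in raw_means}


if __name__ == "__main__":
    effects = experience_effects()
    print(effects)
    hypothetical_cell = {"1-3": 7000, "4-6": 6500, "7-9": 6000}  # illustrative only
    print(adjust(hypothetical_cell, effects))
```

The estimated effects are 0 for periods 1-3, 273 for periods 4-6, and 840 for periods 7-9, so under the assumed rule the first block is left unchanged.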


Manipulation checks 


Ninety-four percent of the subjects (130 of 139) correctly indicated whether they were or 
were not provided with information about other taxpayers' tax rates during the experiment. In 
addition, subjects rated the degree to which they felt they were fairly or unfairly treated “relative
to other taxpayers” on an eleven-point scale with endpoints labeled “very unfairly” (−5) and “very
fairly” (+5), and the midpoint labeled “neither fair nor unfair” (0). An ANOVA with Horizontal
Equity and Exchange Inequity as between-subjects factors and subjects’ ratings of their perceived 
horizontal equity as the dependent variable shows that the ratings of subjects in the Horizontal 
Equity condition (mean = +2.49) are significantly higher (F= 148, p<.001) than the ratings in the 


(Footnote 10 continued) 
would have faced a 30 percent tax-rate in period 6 (see figure 1). Assume that the subject reported 8,000 Lira and was not 
selected for an audit in period 6. This subject’s payment would be (2,000 Lira + 10,000 Lira) minus tax of .30 (8,000 
Lira) = 9,600 Lira. Converting the Lira to cash (9,600 x .001), the subject would be paid $9.60. However, if the subject had 
reported 8,000 Lira and had been randomly selected for an audit in period 6, the subject's payment would also have been 
reduced by the unpaid tax of $.60 = 600 Lira (2,000 unreported income x .30) plus a penalty of $.90 = 900 Lira (1.5 x (2,000 
unreported income x .30)), and thus, the subject would have been paid $8.10. Actual earnings per subject averaged $9.60, 
ranging from the minimum possible ($0.75) to the maximum possible ($12.00). 
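
The payment arithmetic in this example can be sketched in a few lines of Python. This is an illustration only: the function and parameter names are assumptions, exact fractions are used merely to avoid rounding noise, and the 2,000 Lira term is treated simply as a fixed amount added to the 10,000 Lira income, as in the example above.

```python
from fractions import Fraction

LIRA_PER_DOLLAR = 1000              # 1 Lira = $.001, as stated in the example
PENALTY_MULTIPLE = Fraction(3, 2)   # penalty = 1.5 x unpaid tax


def period_payment_lira(fixed_amount, true_income, reported_income, tax_rate, audited):
    """Payment in Lira for one period (hypothetical helper, not from the paper)."""
    tax_rate = Fraction(tax_rate)
    # Tax is paid on reported income only.
    payment = fixed_amount + true_income - tax_rate * reported_income
    if audited:
        # An audit recovers the unpaid tax and adds a penalty on top of it.
        unpaid_tax = tax_rate * (true_income - reported_income)
        payment -= unpaid_tax + PENALTY_MULTIPLE * unpaid_tax
    return payment


def to_dollars(lira):
    """Convert a Lira amount to dollars."""
    return float(Fraction(lira) / LIRA_PER_DOLLAR)


rate = Fraction(30, 100)  # period 6 tax rate in the example
not_audited = period_payment_lira(2000, 10000, 8000, rate, audited=False)
audited = period_payment_lira(2000, 10000, 8000, rate, audited=True)
print(to_dollars(not_audited), to_dollars(audited))
```

With the figures from the example, the sketch gives 9,600 Lira without an audit and 8,100 Lira with one.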

11 Before making the adjustment, we conducted an ANOVA that confirmed that experience had a statistically significant
effect on Mean Reported Income in the baseline condition. The results also indicated that neither the main effect of 
Horizontal Equity (p>.27) nor the interaction between Horizontal Equity and Periods (p>.28) was significant. Because 
there was no significant difference between the Horizontal Equity condition (n=21) and the Horizontal Inequity 
condition (n=21) in the baseline condition, the data were combined as shown in table 1 (n=42). Based on these data, the 
estimated effect of experience for periods 1-3 versus periods 4-6 was calculated as 8,198 - 7,925 = +273, and for periods
1-3 versus periods 7-9 as 8,198 - 7,358 = +840. These estimated effects of experience were then used to adjust the
reported income amounts for all subjects facing the Increasing and Decreasing Exchange Inequity condition. That is, 
for every subject in the Increasing or Decreasing Exchange Inequity conditions, Adjusted Mean Reported Income for 
periods 1—3 simply equals the subject's Mean Reported Income for periods 1-3; i.e., no adjustment was made because 
these three periods were treated as the base. However, for periods 4—6, Adjusted Mean Reported Income equals Mean 
Reported Income for periods 4—6 plus the adjustment of +273, and for periods 7—9, Adjusted Mean Reported Income 
equals Mean Reported Income for periods 7-9 plus the adjustment of +840. 
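
The adjustment can be illustrated with a short Python sketch. The names below are hypothetical, and the sketch applies the adjustment to cell means rather than to each subject's amounts; because the same constant is added for every subject, the resulting cell means are the same.

```python
# Baseline means (in Lira) from table 1; periods 1-3 are treated as the base.
BASELINE_MEANS = {"1-3": 8198, "4-6": 7925, "7-9": 7358}


def experience_adjustments(baseline):
    """Estimated experience effect for each block of periods, relative to periods 1-3."""
    base = baseline["1-3"]
    return {periods: base - mean for periods, mean in baseline.items()}


def adjust(means, adjustments):
    """Add the estimated experience effect back to the reported income means."""
    return {periods: means[periods] + adjustments[periods] for periods in means}


ADJUSTMENTS = experience_adjustments(BASELINE_MEANS)

# Horizontal Inequity, Increasing Exchange Inequity cell from table 1.
unadjusted = {"1-3": 8123, "4-6": 6920, "7-9": 6072}
adjusted = adjust(unadjusted, ADJUSTMENTS)
print(ADJUSTMENTS, adjusted)
```

For this cell the sketch reproduces the corresponding row of table 2 (8123, 7193 and 6912).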
TABLE 1

Mean Reported Income Amounts by Horizontal Equity Condition,
Exchange Inequity Condition and Periods
(in Lira)

Horizontal Equity   Exchange Inequity              Periods   Periods   Periods   Marginal
Condition           Condition*                     1-3       4-6       7-9       Mean

Horizontal          Increasing (n=23)              8123      6920      6072      7039
Inequity:           Decreasing (n=25)              7247      7367      7221      7279
                    Marginal Mean (n=48)           7667      7153      6670      7163

Horizontal          Increasing (n=24)              7521      7549      7333      7468
Equity:             Decreasing (n=25)              7340      6773      6647      6920
                    Marginal Mean (n=49)           7429      7153      6983      7188

                    Overall Marginal Mean (n=97)   7547      7153      6829      7176

Baseline Condition  Constant (n=42)                8198      7925      7358      7827


*Increasing Exchange Inequity is operationalized by increasing the tax rate from 20 percent in periods 1-3 to 30 percent
in periods 4-6 and to 45 percent in periods 7-9. Decreasing Exchange Inequity is operationalized by decreasing the tax
rate from 45 percent in periods 1-3 to 30 percent in periods 4-6 and to 20 percent in periods 7-9.


Horizontal Inequity condition (mean = -2.55), indicating that Horizontal Equity was successfully
manipulated. Neither Exchange Inequity nor the interaction between Horizontal Equity and
Exchange Inequity was significant (Fs < 1), indicating that the Horizontal Equity manipulation
was equally successful in both Exchange Inequity conditions.

The Exchange Inequity manipulation was also successful. Ninety-eight percent of the 
subjects (136 of 139) correctly indicated whether their tax rates increased, decreased or remained 
constant during the experiment. Moreover, the fact that subjects calculated their potential 
payment (which included no public goods) after each of the nine periods, reminded the subjects 
in each period that public goods were held constant at zero. Thus, because subjects in the 
Increasing and Decreasing Exchange Inequity conditions knew that their tax rates changed and 
the direction of the change, and that the level of public goods was constant at zero, they knew that 
their exchange inequity increased (decreased) as the tax rate increased (decreased). 


Hypothesis Tests 


Our hypothesis predicts that, in the Horizontal Inequity condition, reported income will 
decrease (increase) as Exchange Inequity increases (decreases), whereas in the Horizontal Equity 
condition, no such effects will occur. In our design, this means that a significant 2-way interaction 
between Exchange Inequity (Increasing and Decreasing) and Periods (1—3, 4-6 and 7—9) is 


12 The manipulation check provides an indirect measure of exchange inequity. More direct evidence would have required 
that we use an additional question in the post-experimental questionnaire to measure whether and how subjects’ 
exchange inequity perceptions changed as the tax rate changed. 
TABLE 2

Adjusted Mean Reported Income Amounts by Horizontal Equity Condition,
Exchange Inequity Condition and Periods
(in Lira)

Horizontal Equity   Exchange Inequity              Periods   Periods   Periods   Marginal
Condition           Condition*                     1-3       4-6       7-9       Mean

Horizontal          Increasing (n=23)              8123      7193      6912      7409
Inequity:           Decreasing (n=25)              7247      7640      8061      7649
                    Marginal Mean (n=48)           7667      7426      7510      7534

Horizontal          Increasing (n=24)              7521      7822      8173      7839
Equity:             Decreasing (n=25)              7340      7046      7487      7291
                    Marginal Mean (n=49)           7429      7426      7823      7559

                    Overall Marginal Mean (n=97)   7547      7426      7669      7547


*Increasing Exchange Inequity is operationalized by increasing the tax rate from 20 percent in periods 1-3 to 30 percent
in periods 4-6 and to 45 percent in periods 7-9. Decreasing Exchange Inequity is operationalized by decreasing the tax
rate from 45 percent in periods 1-3 to 30 percent in periods 4-6 and to 20 percent in periods 7-9.


predicted in the Horizontal Inequity condition, while no such interaction is predicted in the 
Horizontal Equity condition. The interaction is predicted in the Horizontal Inequity condition 
because in the Increasing Exchange Inequity condition, reported income is predicted to decrease 
as the tax rate increases from periods 1—3 (20 percent) to periods 4—6 (30 percent) and again to 
periods 7—9 (45 percent), while in the Decreasing Exchange Inequity condition, reported income 
is predicted to increase as the tax rate decreases from periods 1—3 (45 percent) to periods 4—6 (30 
percent) and again to periods 7—9 (20 percent). No interaction is predicted in the Horizontal Equity 
condition because reported income is not predicted to change significantly as the tax rate increases 
or decreases. 

Because a 2-way interaction between Exchange Inequity and Periods is predicted in the 
Horizontal Inequity condition, but not in the Horizontal Equity condition, a significant 3-way 
interaction is predicted between Horizontal Equity, Exchange Inequity and Periods when all data 
from both the Horizontal Equity and Inequity conditions (summarized in table 2) are included in 
the analysis. An ANOVA was conducted to test for this 3-way interaction. The dependent variable 
was Adjusted Mean Reported Income as summarized in table 2. Each subject's adjusted reported
income was averaged across periods 1—3, 4—6 and 7—9 to arrive at an Adjusted Mean Reported 
Income amount for each set of three periods in which the tax rate was held constant. There were 
three independent variables: Horizontal Equity (between-subjects factor), Exchange Inequity 
(between-subjects factor) and Periods (within-subject factor). Consistent with our hypothesis, the 
3-way interaction is significant (p=.023) and no other factors or interactions are significant. This 
result indicates that the amount of income subjects reported as the tax rate changed differed across 
the Horizontal Inequity and Horizontal Equity conditions, and provides initial support for our 
hypothesis. In addition, this significant 3-way interaction makes it appropriate to perform 
separate ANOVAs for the Horizontal Inequity condition (panel A of table 3) and the Horizontal 
Equity condition (panel B of table 3) to further test our hypothesis. The corresponding data are 
TABLE 3
Analyses of Variance for Adjusted Mean Reported Income


Panel A: Horizontal Inequity Condition: ANOVA Results of Adjusted Mean Reported Income (n=48)*

                                  F        p-Value
Factors:
  Exchange Inequity               .09      .761
  Periods                         .34      .711
Interaction:
  Exchange Inequity x Periods     4.67     .012**


Panel B: Horizontal Equity Condition: ANOVA Results of Adjusted Mean Reported Income (n=49)*

                                  F        p-Value
Factors:
  Exchange Inequity               .46      .503
  Periods                         .93      .399
Interaction:
  Exchange Inequity x Periods     .45      .637


* Adjusted Mean Reported Income equals each subject's reported income adjusted for experience as described in footnote
11. For each subject, these adjusted reported income amounts have been averaged over periods 1-3, periods 4-6 and
periods 7-9 to obtain the Adjusted Mean Reported Income.

** As predicted, this interaction is statistically significant and explains 5 percent of total variance (ω² = .05) and all other
effects are not significant and explain none of the total variance (ω² = 0).





presented graphically in figures 2A (Horizontal Inequity condition) and 2B (Horizontal Equity 
condition). 

In the Horizontal Inequity condition, (panel A of table 3), we find the predicted significant 
interaction between Exchange Inequity and Periods (p=.012). Moreover, figure 2A shows that all 
changes in reported income are directionally consistent with the predicted pattern, i.e., reported 
income decreased (increased) as the tax rate increased (decreased). The results for the 
Horizontal Equity condition (panel B of table 3) are also consistent with our predictions. Here 
there is no significant interaction between Exchange Inequity and Periods (F<1), indicating that
when taxpayers know they are equitably treated relative to other taxpayers, they do not 


13 We also examined the simple effects of Periods for the two Exchange Inequity conditions. The decrease in Adjusted
Mean Reported Income was marginally significant when the tax rate increased (F=2.64, p<.08). When the tax rate 
decreased, the increase in Adjusted Mean Reported Income was directionally consistent with our hypothesis but did not 
reach significance at conventional levels (F=2.02, p=.14). Paired t-tests indicated that, in the Increasing Exchange
Inequity condition, reported income decreased significantly (p<.02, one-tailed t-test) between the 20 percent tax rate 
(mean = 8123) and the 30 percent tax rate (mean = 7193), and decreased significantly (p<.04, one-tailed t-test) between 
the 20 percent tax rate (mean = 8123) and the 45 percent tax rate (mean = 6912). The decrease in reported income between 
the 30 percent tax rate (mean = 7193) and the 45 percent tax rate (mean = 6912) was not statistically significant. The 
pattern of results for the Decreasing Exchange Inequity condition paralleled that of the Increasing Exchange Inequity
condition. 
FIGURE 2


Adjusted Mean Reported Income for the Horizontal Inequity and 
Horizontal Equity Conditions 
significantly change the amount of income they report as the tax rate changes. Thus, the results
of the separate analyses of the Horizontal Inequity and the Horizontal Equity conditions provide
further support for our hypothesis.14
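
The direction of the pattern behind these interactions can be seen directly from the cell means in table 2. The Python sketch below is purely descriptive: the contrast it computes (the change from periods 1-3 to periods 7-9 in the Decreasing condition minus the same change in the Increasing condition) is an illustrative assumption rather than the ANOVA reported in table 3, and it says nothing about statistical significance.

```python
# Adjusted cell means (in Lira) from table 2, ordered as periods 1-3, 4-6, 7-9.
ADJUSTED_MEANS = {
    ("Horizontal Inequity", "Increasing"): (8123, 7193, 6912),
    ("Horizontal Inequity", "Decreasing"): (7247, 7640, 8061),
    ("Horizontal Equity", "Increasing"): (7521, 7822, 8173),
    ("Horizontal Equity", "Decreasing"): (7340, 7046, 7487),
}


def change_first_to_last(means):
    """Change in mean reported income from periods 1-3 to periods 7-9."""
    return means[-1] - means[0]


def descriptive_contrast(horizontal_condition):
    """Change under falling tax rates minus change under rising tax rates."""
    decreasing = change_first_to_last(ADJUSTED_MEANS[(horizontal_condition, "Decreasing")])
    increasing = change_first_to_last(ADJUSTED_MEANS[(horizontal_condition, "Increasing")])
    return decreasing - increasing


for condition in ("Horizontal Inequity", "Horizontal Equity"):
    print(condition, descriptive_contrast(condition))
```

In the Horizontal Inequity condition, reported income falls by 1,211 Lira as the tax rate rises and rises by 814 Lira as it falls, which is the predicted pattern. The Horizontal Equity cells show no such pattern.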


Further analyses 


The risk preference data provide assurance that our findings are not due to subjects’ risk 
preferences. The results indicate that: (1) consistent with the assumptions of the economic model, 
subjects were predominantly risk averse, and (2) any differences in reported income across the 
Horizontal Inequity and Horizontal Equity conditions cannot be attributed to systematic differences
in subjects' risk preferences across the two conditions. Overall, 65 percent of the subjects
were risk averse, 26 percent were risk neutral and nine percent were risk seeking. In the Horizontal 
Inequity condition, corresponding percentages were 69 percent, 22 percent and nine percent, 
whereas in the Horizontal Equity condition, corresponding percentages were 60 percent, 31 
percent, and nine percent. 

To ensure that our statistical results do not depend on the adjustments to reported income 
amounts, we repeated the statistical analyses reported in the paper using unadjusted reported 
income amounts. As indicated earlier, the results for the unadjusted reported income amounts are 
the same as the results for the adjusted data.16 We also reanalyzed the data using two alternative
adjustment approaches, and again the results are the same as those reported.17 Finally, to ensure
that our results do not depend on the fact that we included 21 completely honest subjects in the 
analysis, we repeated the analysis excluding these subjects and again the results are the same as 
those reported. 


14 We also calculated the number and percent of taxpayers who underreported (i.e., reported less than 10,000 Lira for at
least one of the three periods) for the two Horizontal Equity conditions as Exchange Inequity changed. In the Horizontal
Inequity condition, there is a systematic pattern in which the percent of subjects who underreported increased each time
Exchange Inequity increased (61 percent for the 20 percent tax rate, 74 percent for the 30 percent tax rate and 78 percent 
for the 45 percent tax rate) and decreased each time Exchange Inequity decreased (68 percent for the 45 percent tax rate, 
56 percent for the 30 percent tax rate and 48 percent for the 20 percent tax rate). In contrast, in the Horizontal Equity 
condition, the percent of taxpayers who evaded remained relatively constant as Exchange Inequity increased or 

15 Subjects chose either a sure thing or a lottery which paid either 2000 Lira or 0. The lottery probabilities of winning 2000
decreased in increments of .05 from a probability of .85 to a probability of .15. This produced a total of 15 choices 
between 1000 for certain (the sure thing) and the lottery which paid either 2000 or 0. The 15 lotteries were listed in 
decreasing order of their expected value. Lotteries 1-7 were favorable (expected value greater than 1000), lottery 8 was 
actuarially neutral (expected value equal to 1000) and lotteries 9-15 were unfavorable (expected value less than 1000). 
Subjects were classified as risk averse if they selected the lottery in choice 1, but switched to the sure thing at choice 
8 and continued to choose the sure thing thereafter. Subjects were classified as risk seeking if they selected the lottery 
in choice 1, but switched to the sure thing for the first time at choice 9 or later.
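
The menu of 15 choices described in this footnote can be reconstructed with a short Python sketch. The names are hypothetical, and probabilities are held as whole percentages only to keep the arithmetic exact; the sketch lists the expected values and does not attempt to classify subjects.

```python
SURE_THING = 1000   # Lira received for certain
PRIZE = 2000        # Lira paid by the lottery if it wins


def lottery_menu():
    """Return (choice number, win probability in percent, expected value, label) for each choice."""
    menu = []
    for choice in range(1, 16):
        win_pct = 85 - 5 * (choice - 1)        # .85 down to .15 in steps of .05
        expected_value = PRIZE * win_pct // 100
        if expected_value > SURE_THING:
            label = "favorable"
        elif expected_value == SURE_THING:
            label = "actuarially neutral"
        else:
            label = "unfavorable"
        menu.append((choice, win_pct, expected_value, label))
    return menu


for row in lottery_menu():
    print(row)
```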

16 An ANOVA with the unadjusted Mean Reported Income data (top portion of table 1, n=97) as the dependent variable
and Periods, Exchange Inequity and Horizontal Equity as independent variables produced a significant three-way
interaction (reflecting the fact that there was a significant two-way interaction between Exchange Inequity and Periods
in the Horizontal Inequity condition but not in the Horizontal Equity condition). In addition, there was a significant effect
of Periods (reflecting the experience effect in the unadjusted data). There were no other statistically significant variables
or interactions. These results are identical to those reported in the paper for the comparable ANOVA for the adjusted
data, except that with the adjusted data, Periods (experience) is no longer significant because the adjustment removed
this effect. The results of ANOVAs comparable to those reported in panels A and B of table 3 for the Horizontal Inequity
and Horizontal Equity conditions using the unadjusted data also yield results equivalent to those reported for the adjusted
data.

17 The two alternative approaches were: (1) using the median reported income amounts rather than the mean to adjust the 
data, and (2) using separate means for the Horizontal Inequity and Horizontal Equity conditions (versus the combined 
mean) to adjust the data. 
VI. DISCUSSION


The results of this study provide evidence that, in addition to the usual economic incentives, 
subjects’ responses to tax-rate changes are affected by their perceptions of both horizontal and 
exchange inequity. Subjects reported less (more) income as tax rates increased (decreased) when 
they were inequitably treated relative to others, but not when they were equitably treated relative 
to others. 

These findings have several implications. First, our results help to explain the puzzle of why 
only one of the many previous tax-rate change studies, Beck et al. (1991), found support for the 
predictions of the economic model. Apparently, by using an abstract experimental setting, Beck 
et al. prevented perceptions of horizontal and exchange inequity from affecting subjects’ 
reporting decisions. Beck et al.'s study is important in that it shows that, in the absence of perceptions
of inequity, the economic model has predictive power. However, given the pervasive finding in
both experimental and field studies that taxpayers behave in a manner opposite that predicted by 
the standard economic model of tax reporting, it appears that to explain taxpayer behavior in more 
realistic settings the analysis may need to incorporate other factors such as perceptions of 
inequity. 

Second, our findings suggest the need to reconsider several standard conclusions drawn from 
the prior empirical research. Alm (1991, 584) states that prior experimental results demonstrate 
that, “Individuals report less income as the tax rate rises.” Our results suggest that while this 
conclusion may hold when taxpayers feel inequitably treated relative to other taxpayers, it may 
not hold when taxpayers feel equitably treated relative to other taxpayers. Consistent with Scott 
and Grasmick’s (1981) arguments, this finding demonstrates the importance of examining the 
interaction of factors that influence tax reporting behavior. 

A second standard conclusion that may need to be reevaluated is that taxpayers generally 
report less income when they feel that they are treated inequitably relative to other taxpayers (Alm 
1991, 584). We find no such main effect of our horizontal equity manipulation (p=.95) despite 
evidence that our manipulation was effective.18 Rather than a main effect of horizontal equity, we
find that horizontal equity interacts with the changes in exchange inequity resulting from tax-rate
changes to affect taxpayers’ reporting decisions. 

Third, our results suggest that the results of tax experiments and field studies are likely to 
depend on taxpayers’ perceptions of equity. Thus, it is important that taxpayers’ equity 
perceptions be measured or otherwise controlled when designing experiments or interpreting 
field data. 

Finally, the equity effects found in this study may be important in other accounting settings. 
For example, the cost to corporate headquarters (in terms of distortions of otherwise optimal risk- 
sharing arrangements) of obtaining truthful reports from unit managers may depend on whether 
such managers perceive that the information is being used to treat them equitably relative to other 
units. To the extent that unit managers perceive that they are being inequitably treated (e.g., 
bearing more corporate overhead cost than other comparable units or paying unreasonably high 
transfer prices), they may be more inclined to misrepresent their own units’ financial conditions. 
While our results provide no direct evidence concerning such corporate environments, the 


18 The reported p-value of .95 is from the ANOVA using the adjusted data summarized in table 2 to test for the predicted 
3-way interaction. The main effect of Horizontal Equity is also nonsignificant in the comparable ANOVAs using the 
unadjusted data, the alternative adjustment approaches or the adjusted data excluding honest taxpayers. This finding of 
no main effect of horizontal equity is consistent with results reported by Webley et al. (1988), but inconsistent with 
results reported by Spicer and Becker (1980). 
similarities between such settings and the tax-reporting environment suggest that consideration
of equity perceptions in corporate environments is appropriate. 

A limitation of this study is that the audit probability and penalty rates were higher than those 
in most actual tax settings. Although the parameters used were similar to those used in previous 
experimental studies, it remains an open question whether our results generalize to settings with 
more realistic parameters. A related feature of our study, and indeed of all experimental tax 
studies, concerns how well the monetary penalties used in tax experiments capture all aspects of 
the real-world penalty structure such as the threat of incarceration or the embarrassment 
associated with tax evasion. Further, as in the previous experimental studies, the IRS's audit 
strategy was exogenously fixed. Although this may be an appropriate assumption (see Beck et al. 
1991, fn 3), the question as to how taxpayers would respond to tax-rate changes if they faced a 
strategic IRS remains unanswered. Similarly, because our experiment involved the reporting of 
income which the IRS could verify only through an audit, it is not clear to what extent our results 
generalize to income reported independently to the IRS by employers. Finally, this study did not 
examine how taxpayers respond to tax-rate changes as the amount of public goods or services 
varies. This issue is addressed by Kim et al. (1995) who extend the present study by manipulating 
public goods in order to examine the effects of exchange inequity on tax reporting behavior. 


REFERENCES 

Adams, J. 1965. Inequity in social exchange. In Advances in Experimental Social Psychology, edited by L. 
Berkowitz. New York, NY: Academic Press. 

Allingham, M., and A. Sandmo. 1972. Income tax evasion: A theoretical analysis. Journal of Public 
Economics 1: 323—338. 

Alm, J. 1991. A perspective on the experimental analysis of taxpayer reporting. The Accounting Review 66 
(July): 577—593. 

Alm, J. 1988. Uncertain tax policies, individual behavior, and welfare. The American Economic Review
78 (March): 237-245.

Alm, J., G. McClelland, and W. Schulze. 1992. Why do people pay taxes? Journal of Public Economics
48 (June): 21-38.

Alm, J., B. Jackson, and M. McKee. 1992. Institutional uncertainty and taxpayer compliance. American
Economic Review 82 (September): 1018-1026.

Baldry, J. 1987. Income tax evasion and the tax schedule: Some experimental results. Public Finance 42: 
357—383. 

Beck, P., J. Davis, and W. Jung. 1991. Experimental evidence on taxpayer reporting under uncertainty. The 
Accounting Review 66 (July): 535—558. 

Becker, W., H. Buchner, and S. Sleeking. 1987. The impact of public transfer expenditures on tax evasion. 
Journal of Public Economics 34 (November): 243—252. 

Benjamini, Y., and S. Maital. 1985. Optimal tax evasion and optimal tax evasion policy: Behavioral aspects. 
In The Economics of the Shadow Economy, edited by A. Wenig and W. Gürtner. Berlin and New York, 
NY: Springer Verlag. 

Clotfelter, C. 1983. Tax evasion and tax rates: An analysis of individual returns. Review of Economics and 
Statistics 65 (August): 363-373. 

Collins, J., and D. Plumlee. 1991. The taxpayer's labor and reporting decision: The effect of audit schemes. 
The Accounting Review 66 (July): 559—576. 

Cowell, F., and J. Gordon. 1988. Unwillingness to pay. Journal of Public Economics 36 (3) (August): 305— 
321. 

Crane, S., and F. Nourzad. 1986. Inflation and tax evasion: An empirical analysis. The Review of Economics 
and Statistics 48 (2) (May): 217—223. 

Davis, J., and C. Swenson. 1988. The role of experimental economics in tax policy research. The Journal 
of the American Taxation Association 10: 40—59. 











634 The Accounting Review, October 1995 


Dean, P., T. Keenan, and F. Kenney. 1980. Taxpayers’ attitudes to income tax evasion: An empirical study. 
British Tax Review 25: 28—44. 

Friedland, N., S. Maital, and A. Rutenberg. 1978. A simulation study of tax evasion. Journal of Public 
Economics 8: 107—116. 

Graetz, M., J. Reinganum, and L. Wilde. 1986. The tax compliance game: Toward an interactive theory of 
law enforcement. Journal of Law, Economics, and Organization 2 (Spring): 1—32. 

Jackson, B., and V. Milliron. 1986. Tax compliance research: Findings, problems, and prospects. Journal 
of Accounting Literature 5: 125—165. 

Kim, C., J. Evans III, and D. Moser. 1995. The effect of public transfers and tax-rate changes on reported
income. Working paper, University of Pittsburgh, Pittsburgh, PA. 

Lewis, A. 1982. The Psychology of Taxation. New York, NY: St. Martin's Press. 

Lewis, A. 1978. Perceptions of tax rates. British Tax Review 23: 358-366.

Milgrom, P., and J. Roberts. 1992. Economics, Organizations & Management. Englewood Cliffs, NJ: 
Prentice Hall.

Murnighan, K., A. Roth, and F. Schoumaker. 1988. Risk aversion in bargaining: An experimental study. 
Journal of Risk and Uncertainty 1 (March): 101—124. 

Pencavel, J. 1979. A note on income tax evasion, labor supply, and nonlinear tax schedules. Journal of Public 
Economics 12 (August): 115-124. 

Poterba, J. 1987. Tax evasion and capital gains taxation. American Economic Review 77(2) (May): 234—239. 

Roberts, M., and P. Hite. 1992. Who should pay, and how much? Working paper, University of Alabama, 
Tuscaloosa, AL. 

Roberts, M., P. Hite, and C. Bradley. 1992. Understanding progressivity: An experimental investigation using
abstract and concrete framing. Working paper, University of Alabama, Tuscaloosa, AL. 

Scott, W., and H. Grasmick. 1981. Deterrence and income tax cheating: Testing interaction hypotheses in 
utilitarian theories. Journal of Applied Behavioral Science 17: 395-408.

Spicer, M., and L. Becker. 1980. Fiscal inequity and tax evasion: An experimental approach. National Tax
Journal 33 (June): 171—175. 

Spicer, M., and S. Lundstedt. 1976. Understanding tax evasion. Public Finance 31: 295-305.

Tversky, A., and D. Kahneman. 1986. Rational choice and the framing of decisions. The Journal of Business 
59: S251-S278.

Vogel, J. 1974. Taxation and public opinion in Sweden: An interpretation of recent survey data. National 
Tax Journal 27: 499—513. 

Walster, E., E. Berscheid, and G. W. Walster. 1973. New directions in equity research. Journal of 
Personality and Social Psychology 25: 151-176. 

Walster, E., G. W. Walster, and E. Berscheid. 1978. Equity: Theory and Research. Boston, MA: Allyn and
Bacon, Inc. 

Webley, P., H. Robben, H. Elffers, and D. Hessing. 1991. Tax Evasion: An Experimental Approach. 
European Monographs in Social Psychology. Cambridge and New York, NY: Cambridge University 
Press. 


Webley, P., H. Robben, and I. Morris. 1988. Social comparison, attitudes and tax evasion in a shop simulation.
Social Behavior 3: 219-28.

Witte, A., and D. Woodbury. 1985. The effect of tax laws and tax administration on tax compliance: The 
case of the U.S. individual income tax. National Tax Journal 38 (March): 1—13. 

Yankelovich, Skelly, and White, Inc. 1984. Taxpayer Attitudes Survey: Final Report. Public Opinion 
Survey Prepared for the Public Affairs Division. New York, NY: Internal Revenue Service. 

Yitzhaki, S. 1974. Income tax evasion: A note. Journal of Public Economics 3 (May): 201—202. 


THE ACCOUNTING REVIEW 
Vol. 70, No. 4

October 1995 

pp. 635-653 


The Effects of Income and 
Consumption Tax Regimes and 
Future Tax Rate Uncertainty on 

Proportional Savings and 

Risk-Taking 


Janet A. Meade 


University of Houston 


ABSTRACT: This paper uses an experimental design to examine how income and 
consumption tax regimes and future tax rate uncertainty affect proportional savings 
and risk-taking. For the experiment, undergraduate subjects were given certificates 
redeemable for goods and services at two time periods. They were then asked to 
determine the time at which they wished to redeem the certificates and the manner 
in which they wished to allocate unredeemed certificates between safe and risky 
investment funds. The results indicate that when future tax rates are certain, an 
income tax regime reduces proportional savings and increases proportional risk-
taking when compared to a consumption tax regime. When future tax rates are 
uncertain, the effects are more complex. They generally suggest, however, that 
future tax rate uncertainty adversely affects the savings and risk-taking neutrality of 
a consumption tax regime while diminishing the risk-taking incentive of an income 
tax regime. 
I. INTRODUCTION


The manner in which institutional risk-sharing arrangements affect behavior is a fundamental
policy issue with broad implications for accountants. Among the questions prompted
by such arrangements are those regarding the effects of professional liability insurance on 
audit performance, regulatory recommendations on firm disclosures, and contractual agreements 
on owner, creditor and manager actions. Within the tax domain, one such question currently
under debate concerns the effects of income and consumption tax regimes on savings and risk- 
taking (e.g. Anderson 1994; Eisner 1995; Gray 1995; Toder 1995). Policy-makers are particularly 
concerned about this question because of the role savings and risk-taking play in maintaining 
relative living standards, business productivity, international competitiveness and economic 
growth.1

Analytical research on the effects of income and consumption tax regimes on savings and 
risk-taking by Ahsan (1989, 1990) indicates that, when compared to a no-tax world, an income 
tax regime decreases the proportion of savings undertaken by individuals, but increases the 
proportion of savings invested in risky assets. A consumption tax regime, in comparison, does not 
alter proportional savings or risk-taking from their no-tax levels so the regime is neutral with 
respect to savings and investment decisions. No empirical research to date, however, has directly 
tested these analytical conclusions. In addition, no empirical or analytical work has examined the 
savings and risk-taking effects of the two tax regimes when future tax rates are uncertain. 

This paper reports the results of an experimental test of the proportional savings and risk- 
taking effects of income and consumption tax regimes under conditions of certain and uncertain 
future tax rates. For this test, 90 undergraduate subjects were assigned to one of two groups 
simulating income and consumption tax regimes. Each group was then required to make savings 
and risk-taking decisions for three repeated treatments, one containing a certain tax rate on both 
present and future transactions, another containing a certain tax rate on present transactions but 
an uncertain tax rate on future transactions, and a third without any tax rate. 

Subjects began the experiment with certificates redeemable for goods and services at two 
time periods: the 30-day period immediately following the experiment or the 30-day period 
commencing three months later. Certificates not redeemed in the immediate 30-day period could 
be invested in two investment funds offering either safe or risky returns. The decision task of the 
experiment required the subjects to determine the time at which they wished to redeem certificates 
and the manner in which they wished to allocate unredeemed certificates between the two 
investment funds. 

In general, the results of the experiment support Ahsan's analytical work regarding the 
savings and risk-taking effects of the two tax regimes when present and future tax rates are certain.
Subjects assigned to the income tax regime group were found to save proportionally less and to 
allocate proportionally more of their savings to risky investment than those assigned to the 
consumption tax regime group. Likewise, when the future tax rate was uncertain, proportional 
savings and risk-taking were lower and higher, respectively, for the income tax regime group than
the consumption tax regime group. The difference in the proportional savings of the two tax 
regime groups, however, was just significant at the .10 level for the uncertain future tax rate. 

Comparing the effects of certain and uncertain tax rates within the two tax regime groups, 
both proportional savings and risk-taking were found to be lower for the consumption tax regime 
group when the future tax rate was uncertain than when it was certain. For the income tax regime 
group, a similar reduction in proportional risk-taking was found. The reduction in proportional 
savings, however, was not significant. 


1 Relative living standards, business productivity, international competitiveness and economic growth are directly
affected by capital investment, and capital investment is directly affected by savings and risk-taking. Low rates of 
savings and risk-taking generally are believed to have an adverse effect on an economy because of their relationship 
with other economic variables. Policy-makers often attempt to positively influence savings and risk-taking behavior by 
changing fiscal and tax incentives (for more complete discussions of the economic effects associated with savings and 
risk-taking, see Eisner 1994; Kosters 1992; Toder 1995). 


Meade—The Effects of Income and Consumption Tax Regimes and Future Tax Rate Uncertainty 637 


Taken together, the results of the study imply that when both present and future tax rates are 
certain, a consumption tax regime is neutral with respect to proportional savings and risk-taking, 
while an income tax regime decreases and increases proportional savings and risk-taking, 
respectively. In comparison, when the future tax rate is uncertain, the savings and risk-taking 
neutrality of a consumption tax regime is adversely affected and the risk-taking incentive of an 
income tax regime is diminished. 

The remainder of this paper is organized as follows. Section II explains the theoretical 
background and hypotheses of the paper. Section III describes the experimental method used to 
test the hypotheses. Section IV presents the statistical results. Section V summarizes the findings 
and suggests directions for future research. 


II. THEORETICAL BACKGROUND AND HYPOTHESES


Income and consumption tax regimes are two major forms of broad-based taxation that create 
substantially different savings and risk-taking incentives.² According to the analytical work of
Ahsan (1989, 1990), an income tax regime, by including savings in the tax base, reduces the after- 
tax rate of return on savings and provides an incentive for taxpayers to shift consumption from 
future periods to the present. The total amount of savings undertaken, as well as the proportion 
of after-tax income allocated to savings, are less than in a no-tax world. However, the proportion 
of savings invested in risky assets is greater than a no-tax world because, in addition to reducing 
the after-tax rate of return on savings, the regime also reduces the variability of the return. 
Taxpayers consequently are encouraged to allocate a larger share of their savings to risky 
investment since the after-tax level of risk associated with such investment is less. 

In a consumption tax regime, Ahsan (1989, 1990) shows that the exclusion of savings from 
the tax base makes the regime neutral with respect to current and future consumption decisions. 
Consequently, the proportion of savings undertaken is the same as that of a no-tax world.³
Similarly, the proportion of savings allocated to risky investment is the same as that of a no-tax 
world because the size of the savings portfolio and the return on savings are not directly affected 
by the consumption tax. Instead, they are only affected when consumed. 

In deriving his conclusions, Ahsan (1989, 1990) uses a two-period model of savings and 
portfolio behavior which assumes that taxpayers act to maximize their utility from consumption 
over both periods and that they exhibit constant relative risk aversion (CRRA).⁴ Additionally, the
model assumes that both the income and consumption tax regimes are proportional with no 
limitation on the deductibility of losses. 

In the first period of the model, taxpayers hold an endowment representing the sum of the 
current value of their net assets at the beginning of the first period and the earnings of that period. 


2 An income tax regime includes in the tax base the sum of an individual's consumption during the taxable period and
the change in his/her net worth. A consumption tax regime differs from an income tax regime in that it excludes from
the tax base the net change in the individual's savings during the period. In its most basic form, a consumption tax is 
collected directly from the taxpayer and is based on the total amount spent on consumption during the tax period. As 
such, it varies from a sales tax or value-added tax which, although also based on consumption, are collected from 
producers and sellers (Bradford 1986). 

3 The analytical work of Hunt and Enis (1989) reaches a similar conclusion regarding the differential savings effect of
income and consumption tax regimes. Their work, however, does not directly discuss the proportional savings effect.
In addition, it does not examine the risk-taking effects of the two tax regimes.

4 CRRA preferences are widely used in the economic risk-taking literature and are asserted by Sinn (1983) to be the only
class of von Neumann-Morgenstern preferences that are consistent with existing psychological work. Individuals who 
exhibit CRRA preferences are insensitive to multiplicative transformations of a lottery with a given expected value, but 
sensitive to additive wealth changes. 
They then allocate this endowment between consumption and savings. The portion of the
endowment allocated to savings may be invested in two assets, one with a certain return and the 
other with a risky return. In the second period, they consume both the savings portion of the 
endowment and the return on those savings. 

After comparing the proportional savings and risk-taking effects of income and consumption 
tax regimes, Ahsan (1989) examines the sensitivity of his model’s conclusions over a broad range 
of parameter values. He also contrasts the effects of the regimes with equal revenue and utility 
tax rates, using a no-tax world as the benchmark. In general, his results indicate that when the tax 
rates of the income and consumption tax regimes are set to raise equal expected revenue, both 
regimes lead to an equal loss of expected utility (versus a no-tax world). However, with such rates 
the income tax regime's savings disincentive is stronger for lower levels of relative risk aversion, 
while its risk-taking incentive is uniform across all levels of relative risk aversion. The savings 
and risk-taking neutrality of the consumption tax regime, in comparison, is unaffected by the level
of relative risk aversion.

Ahsan’s (1989, 1990) model assumes that the tax rate on both present and future transactions 
is constant and certain. This assumption, however, limits the generalizability of his conclusions 
because it ignores the frequency with which the tax rate schedule has explicitly changed during 
the past 15 years (Economic Recovery Tax Act of 1981; Tax Reform Act of 1986; Revenue 
Reconciliation Act of 1993), as well as the almost yearly implicit changes resulting from 
modifications in the tax base (e.g., Tax Equity and Fiscal Responsibility Act of 1982; Deficit 
Reduction Act of 1984; Revenue Reconciliation Act of 1987; Technical and Miscellaneous 
Revenue Act of 1988; Revenue Reconciliation Acts of 1989 and 1990; Tax Extension Act of 
1991). In reality, decisions regarding future consumption are subject to an additional source of 
risk not included in his model—tax rate uncertainty. 

When Ahsan's model is extended to include an uncertain future tax rate having the same 
expected value as the present tax rate, his conclusions are slightly different. The additional risk 
associated with the uncertain future tax rate serves as a disincentive to savings for future 
consumption among taxpayers who exhibit CRRA. The proportion of savings undertaken by 
these taxpayers, therefore, declines below that which would occur with a certain future tax rate 
for either income or consumption tax regimes. In other words, the savings discouragement of an 
income tax regime is intensified, while the savings neutrality of a consumption tax regime is 
negatively disturbed. The magnitude of these adverse savings effects, as well as the extent to 
which the two regimes differ in their proportion of savings, are dependent on the degree of 
uncertainty in the future tax rate and the relative rates of the regimes. 

The effect of future tax rate uncertainty on proportional risk-taking is similar. The additional 
risk associated with the uncertain future rate causes CRRA taxpayers to reduce the proportion of 
their savings invested in the risky asset. This reduction occurs for both income and consumption 
tax regimes because CRRA taxpayers seek to hold their overall risk level constant, irrespective 
of the structure of the tax regime. The risk-taking incentive of an income tax regime is 
consequently moderated, while the risk-taking neutrality of a consumption tax regime is 
adversely altered. The intensity to which proportional risk-taking is affected, as well as the extent 
to which the two regimes differ in their proportional risk-taking, are again dependent on the
degree of uncertainty in the future tax rate and the relative rates of the two regimes. 


5 A secondary aspect of Ahsan's (1989, 1990) analysis involves the savings and risk-taking effects of a wealth tax regime. 
Because a wealth tax regime is equivalent to a consumption tax regime in its effect on the life-time budget constraint, 
it is not examined in this paper. 
To compare the specific savings and risk-taking effects of future tax rate uncertainty for the
two tax regimes, four qualifying conditions must be added to the analysis. These conditions
require that (1) the expected revenues and utilities of the two regimes are equal, (2) the before-
tax rates of return on savings for the two regimes are identical, (3) the probability assessments
about the two regimes’ future tax rates are equivalent, and (4) the level of relative risk aversion 
exhibited by CRRA taxpayers is close to unity. 

The first two conditions are explicit requirements necessary to ensure that the risk attributable 
to the uncertain future rates is comparable. As such, they also correspond with the conditions used 
by Ahsan (1989) in his numerical analysis of the savings and risk-taking effects of certain future 
tax rates. The third condition is a supplemental requirement needed to equalize the level of risk
arising from different assessments about the likelihood and distribution of the future tax rates. 
When combined with the earlier assumption of equal present and expected future tax rates, it 
establishes parallel tax rate expectations for the two regimes. The fourth condition is an analytical 
requirement based on the risk-taking analysis of Arrow (1974). It confines relative risk aversion 
to a tractable range and allows for behavioral predictions. 

Given the preceding conditions, it is possible to draw general inferences regarding the 
differential savings and risk-taking effects of income and consumption tax regimes subject to 
future tax rate uncertainty. Specifically, it can be shown that an uncertain future tax rate causes 
the proportional savings of an income tax regime to be less than that of a consumption tax regime.
Concurrently, an uncertain future tax rate causes the proportional risk-taking of an income tax
regime to be greater than that of a consumption tax regime.⁶ The differential effects of income
and consumption tax regimes posited by Ahsan (1989, 1990) for certain future tax rates thus 
continue in the presence of future tax rate uncertainty. 

Despite the current interest in the savings and risk-taking effects of income and consumption 
tax regimes among accountants, attorneys, economists and policy-makers (e.g., American 
Council for Capital Formation 1994, 1995; Kosters 1992; National Research Council Board on 
Science, Technology, and Economic Policy 1994; Nunn et al. 1992; Pollack 1995), no empirical 
research has yet tested the basic conclusions of Ahsan’s (1989, 1990) analytical work or the 
disincentive effects of future tax rate uncertainty. Instead, the models and methods of previous 
empirical studies have been motivated by other concerns. Empirical research on savings, for 
example, generally has employed econometric models to examine the relationship between 
savings and after-tax returns. These studies’ findings, however, have been inconclusive. Several 
investigations indicate that proportional savings would be less for an income tax regime than a 
consumption tax regime because a positive relation exists between the real after-tax interest rate 
and savings (e.g., Boskin 1978; Boskin and Lau 1988; Makin and Couch 1989; Skinner and 
Feenberg 1990; Summers 1981, 1984). Others suggest that proportional savings would not be 
affected by the structure of the tax regime because consumption patterns are unpredictable and 
insignificantly influenced by the after-tax interest rate (e.g., Evans 1983; Friend and Hasbrouck 
1983; Gravelle 1992; Hall 1988; Howrey and Hymans 1980). 

In contrast to the wealth of empirical studies on savings, the empirical research on risk-taking 
is more sparse. To date, only two studies have been conducted in this area, and both have utilized 
an experimental economics methodology (King and Wallin 1990; Swenson 1989). Despite a 
consistency in their findings, both studies fail to provide much insight into the differential effects 
of income and consumption tax regimes under conditions of certain and uncertain future tax rates 


6 Details regarding the manner in which the differential proportional savings and risk-taking effects were derived are
available upon request. 


because their primary focus was the risk-taking effects of an income tax regime with a certain tax
rate. The findings of these two studies do suggest, nonetheless, that a proportional income tax 
regime with a certain tax rate has little or no impact on investment in risky assets. 

This study uses an experimental design to examine the proportional savings and risk-taking 
effects of income and consumption tax regimes under conditions of certain and uncertain future 
tax rates. The findings of the study provide general insight into the behavioral effects of 
institutional risk-sharing arrangements, as well as specific insight into the disincentive effects of 
taxation. Initially, two research hypotheses are tested based on the analytical work of Ahsan 
(1989, 1990): 


H1a: Proportional saving is lower for an income tax regime with a certain tax rate than for
either a consumption tax regime with a certain tax rate or a regime without a tax. 


H1b: Proportional risk-taking is greater for an income tax regime with a certain tax rate than
for either a consumption tax regime with a certain tax rate or a regime without a tax. 


Six additional research hypotheses are tested based on the extension of Ahsan's (1989, 1990)
work to include the effects of uncertainty in the future tax rate: 


H2a: Proportional saving is lower for an income tax regime with an uncertain future tax rate 
than for a consumption tax regime with an uncertain future tax rate. 


H2b: Proportional risk-taking is greater for an income tax regime with an uncertain future 
tax rate than for a consumption tax regime with an uncertain future tax rate. 


H3a: Proportional saving is greater for an income tax regime with a certain future tax rate 
than for one with an uncertain future tax rate. 


H3b: Proportional risk-taking is greater for an income tax regime with a certain future tax 
rate than for one with an uncertain future tax rate. 


H4a: Proportional saving is greater for a consumption tax regime with a certain future tax 
rate than for one with an uncertain future tax rate. 


H4b: Proportional risk-taking is greater for a consumption tax regime with a certain future 
tax rate than for one with an uncertain future tax rate. 


In essence, the savings hypotheses of H1a, H2a, H3a and H4a posit that $S_{N} = S_{CC} > S_{IC} > S_{IU}$
and $S_{CC} > S_{CU} > S_{IU}$, where S denotes proportional savings, the first subscript refers to the tax
regime (I = income, C = consumption, N = none) and the second subscript refers to the future tax rates
(C = certain, U = uncertain) where applicable. Likewise, the risk-taking hypotheses of H1b, H2b, H3b and
H4b propose that $R_{IC} > R_{CC} = R_{N} > R_{CU}$ and $R_{IC} > R_{IU} > R_{CU}$, where R denotes
proportional risk-taking. No hypotheses
are posited for the relationship between proportional savings for a consumption tax regime with 
an uncertain future tax rate and an income tax regime with a certain future tax rate. The specific 
nature of this relationship depends on the extent of uncertainty in the future tax rates. For a similar 
reason, no relationship is posited between proportional risk-taking for an income tax regime with 
an uncertain future tax rate, that of a consumption tax regime with a certain future tax rate, and 
that of a no-tax regime. 




III. METHOD
Experimental Design 


The experiment utilized a repeated measure, between-subjects design with two groups and 
three treatments. The two between-subjects groups simulated income and consumption tax 
regimes. The three within-subject treatments varied the certainty of future tax rates and the 
existence of a tax. In both the income and consumption tax regime groups each subject received 
in random order a control treatment in which no tax was mentioned or assessed, a certain tax rate 
treatment in which the present and future tax rates were fixed at a single rate, and an uncertain tax 
rate treatment in which the present tax rate was fixed but the future tax rate was variable. In this 
latter treatment, the expected future tax rate was the same as the present tax rate, but was 
distributed with a 50 percent probability of being either higher or lower. 


Subjects 


Ninety undergraduate subjects from advanced accounting classes were selected to participate 
in the experiment from a pool of 142 volunteers. Selection was based on the results of a pre-test
questionnaire which assessed the subjects’ risk and savings preferences. The risk portion of the
questionnaire elicited certainty equivalents for two series of lottery questions. The first series used 
Kachelmeier's (1990) price judgment instrument to identify risk-averse subjects. The second 
series examined the subjects' sensitivity to multiplicative transformations of lottery values and, 
thus, indicated relative risk preferences. Together, the results of these two series were used to 
select only those subjects who exhibited similar and constant relative risk aversion.⁷

Although not used as a selection device, the savings portion of the questionnaire was 
employed to classify subjects into three strata. Subjects from these strata were then randomly 
assigned to the two tax regime groups. A similar procedure was followed with respect to the pre-
test risk measures of the subjects. The resulting tax regime groups consequently exhibited 
equivalent pre-test savings and risk-taking preferences. 

Compensation for participation in the experiment was received in the form of certificates
redeemable for goods and services at various retail establishments located near campus and 
catering to college students. Certificates were used rather than cash because the theoretical 
construct under investigation involved consumption behavior. The use of certificates, which 


7 Subjects were excluded from the experimental sample when the pre-test questionnaire indicated that they were neither
risk-averse nor constant in their risk preferences. The Berg et al. (1986) method of inducing risk preferences was not 
used for this study because such an induction method would have been a joint test of the risk-induction technique and 
the analytical effects of income and consumption tax regimes (for a more complete discussion of the limitations of the 
Berg et al. risk-induction technique, see Davis and Holt 1993). However, to check the reasonableness of the risk
preference classification, a post-experimental analysis of the subjects’ risk-taking behavior was conducted. For this 
analysis, the relative risk undertaken by the subjects, including tax-induced risk, was measured by computing
coefficients of variation for each subject’s investment portfolio in each of the three repeated tax treatments. The 
calculation of these relative risk measures involved dividing the standard deviation of the potential after-tax returns of 
each investment portfolio by the mean expected after-tax return. When the future tax rate was certain, only the income 
tax regime group was subject to a tax-induced change in the risk of their after-tax returns. When the future tax rate was 
uncertain, both the income and consumption tax regime groups were subject to tax-induced changes. Different relative 
risk measures consequently were possible for the various tax treatments even when the same percentages of a subject's 
investment portfolio were invested in the safe and risky funds. The results of this analysis suggested that the subjects 
exhibited similar relative risk aversion, that their behavior was consistent with the CRRA assumption, and that their risk-
taking decisions were not significantly affected by differences in the magnitudes of the tax treatments’

The analysis, however, did not indicate whether the subjects’ level of relative risk aversion was close to unity because
no such empirical measurement was possible. 
could not be redeemed in part or whole for cash, avoided confounding consumption and savings
behaviors among subjects having differing post-experimental investment opportunities.⁸

To control for wealth effects across the tax rate treatments, only one of the subjects’ decision 
sets from the three treatments was used to determine the number of certificates received. The 
specific determination of which decision set to use was made by the subjects at the end of the 
experiment when they were asked to roll a single die. If they rolled a one or two, their payoff was 
based on their first set of decisions. If they rolled a three or four, their payoff was based on their 
second set of decisions. If they rolled a five or six, their payoff was based on their third set of 
decisions. The rationale underlying this random-selection method of compensating the subjects 
was that, by treating each decision set as separable and equally likely, it required those subjects 
who wished to maximize their expected payoff to make independent and optimal decisions for 
each of the three treatments. Changes in wealth during one of the treatments consequently should 
have had no effect on the decisions for the other treatments (Davis and Holt 1993).
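
The random-selection payoff rule can be sketched in a few lines of Python. The function name `decision_set_for_roll` is hypothetical and is not part of the original experimental materials; the sketch only illustrates that, with a fair die, each of the three decision sets is selected with probability one-third.

```python
from fractions import Fraction


def decision_set_for_roll(roll: int) -> int:
    """Map a roll of a fair six-sided die to decision set 1, 2 or 3."""
    if roll not in range(1, 7):
        raise ValueError("roll must be an integer from 1 to 6")
    # Rolls 1-2 -> first set, 3-4 -> second set, 5-6 -> third set.
    return (roll + 1) // 2


# Each decision set is chosen by exactly two of the six equally likely faces.
selection_probability = {
    s: Fraction(sum(1 for r in range(1, 7) if decision_set_for_roll(r) == s), 6)
    for s in (1, 2, 3)
}
```

Because every decision set is equally likely to determine the payoff, a subject maximizing expected payoff has no reason to treat any one treatment as less consequential than the others.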


Task 


The task was administered to the subjects in separate rooms of a university behavioral 
laboratory. Upon entering the laboratory room, the subjects were given written instructions and 
shown 1,400 certificates, denominated in amounts of one, ten, 25, 50 and 100. They were then
told, via the instructions, that the experimental task would require them to make three sets of 
decisions under different conditions and their payoff from these decisions would be randomly 
determined at the end of the experiment when they rolled a die. 

As to the form of their payoff, the subjects were told that they would receive certificates 
redeemable for goods and services at any or all of five different retail establishments. The five 
establishments consisted of a computer software dealer, a copying center, a fast-food outlet, a 
bookstore and university merchandiser, and a compact disc and tape store. The rate of redemption 
for their certificates, the subjects were informed, was 100 certificates for $1 of goods and 
services.⁹

In the treatment without a simulated tax, neither the income nor consumption tax regime
groups received any additional information regarding the redemption rate. In the certain tax rate 
treatment, however, the income tax regime group was further told that an immediate redemption 
fee of 25 percent would be assessed on their entire 1,400 certificates and that another redemption 
fee of 25 percent would be levied on any gains, net of losses, resulting from subsequent investment 
of unredeemed certificates (discussed below). The instructions also specified that for losses in 
excess of gains, the redemption fee would be waived and that certificates equal to 25 percent of 
the excess losses would be granted. 


8 The receipt of cash payoffs could have confounded consumption and savings behaviors because some subjects might
have selected to receive the entire amount of their cash payoffs at the first available opportunity and, thus, appeared 
within the experimental design to be making a consumption decision when, in actuality, they intended to invest their 
payoffs outside the experimental setting in higher-yielding investments than those afforded by the 

9 Denominated certificates of less than 100 were printed in rolls so that they could be carried, counted and redeemed easily.
To redeem certificates of less than 100, the subjects unwound the appropriate number and tore this amount off along 
a perforated edge. Different colors were used to aid in distinguishing the various denominated amounts. At the end of 
the experiment, when the subjects actually received their certificates, those subjects in the consumption tax regime group
whose payoffs were based on their decisions for the certain and uncertain tax rate treatments received denominated 
certificates that reflected the redemption fee. For example, the denominated certificates received by the subjects in the 
certain tax rate treatment were in amounts of 1.4, 14, 35, 70 and 140 and were redeemable for goods and services having 
a cash value of 1¢, 10¢, 25¢, 50¢ and $1.00, respectively. Similar adjustments were made to the certificates received
by the subjects in the uncertain tax rate treatment. 
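
The adjustment to the denominated certificates described in this footnote can be checked with a short Python sketch. It assumes that the 40 percent redemption fee of the certain tax rate treatment is charged on the value of goods received, so that goods worth v require certificates worth 1.4v; the function name `goods_value_in_cents` is illustrative only.

```python
from fractions import Fraction

FEE_ON_GOODS = Fraction(40, 100)   # certain-rate fee, consumption tax regime
CERTIFICATES_PER_DOLLAR = 100      # stated redemption rate before any fee

# Certificate amounts reported in the footnote for the certain tax rate treatment.
adjusted_amounts = [Fraction(14, 10), Fraction(14), Fraction(35),
                    Fraction(70), Fraction(140)]


def goods_value_in_cents(certificates: Fraction) -> Fraction:
    """Cents of goods obtainable once the fee on goods received is covered."""
    dollars = certificates / CERTIFICATES_PER_DOLLAR / (1 + FEE_ON_GOODS)
    return dollars * 100


values = [goods_value_in_cents(c) for c in adjusted_amounts]
```

Under this assumption the five adjusted amounts correspond to goods worth 1, 10, 25, 50 and 100 cents, which matches the cash values listed in the footnote.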
The instructions of the certain tax rate treatment for the consumption tax regime group
differed from those for the income tax regime group in that the subjects were informed that each
time a certificate was redeemed, a redemption fee equal to 40 percent of the value of the goods 
and services received would be assessed. They were told that a redemption fee would not be 
imposed, however, until the time of a redemption. No information was provided regarding the 
treatment of gains and losses since these amounts were automatically included in the total number 
of certificates redeemed. 

Both tax regime groups received instructions for the uncertain tax rate treatment that were 
similar to those of their certain tax rate treatment. The distinguishing difference of the uncertain 
tax rate treatment was that the future redemption fee was presented as having an equal probability 
of being a rate either higher or lower than that of the present redemption fee. For the income tax 
regime group, the two future redemption rates were ten and 40 percent and were applicable to 
gains and losses on invested certificates. For the consumption tax regime group, the two future 
redemption rates were 20 and 60 percent and were applicable to both invested certificates and net 
earnings. 
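
As a quick check on these parameter values, the following Python sketch confirms that the two equally likely future redemption fees in each regime average to that regime's present fee. The dictionary and function names are illustrative and do not come from the experimental materials.

```python
from fractions import Fraction

# Present redemption fees (percent) and the two equally likely future fees,
# as stated in the instructions given to each tax regime group.
PRESENT_FEE = {"income": 25, "consumption": 40}
FUTURE_FEES = {"income": (10, 40), "consumption": (20, 60)}


def expected_future_fee(regime: str) -> Fraction:
    """Expected future fee when the low and high rates are equally likely."""
    low, high = FUTURE_FEES[regime]
    return Fraction(1, 2) * low + Fraction(1, 2) * high


expected = {regime: expected_future_fee(regime) for regime in PRESENT_FEE}
```

The expected future fee is 25 percent for the income tax regime group and 40 percent for the consumption tax regime group, so the uncertain treatment changes the spread of the future rate but not its expected value.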

The primary decision required by the experimental task for all three tax rate treatments 
involved the timing of current and future consumption; a secondary decision concerned the 
riskiness of investment. Before making either decision, the subjects were told that they had two 
opportunities to redeem certificates. The first opportunity was within the 30-day period imme- 
diately following the experiment. The second opportunity was after a lapse of three months, when 
certificates could again be redeemed within a 30-day period. The redemption opportunities were 
presented in a manner that allowed the subjects to select one or both redemption times for part or 
all of their certificates.

As an incentive to temporarily postpone redemption, the subjects were permitted to invest 
unredeemed certificates for the three-month period in one or both of two investment funds. One 
of these funds, Fund A, promised to pay a certain before-tax return of 50 percent on invested 
certificates. The other fund, Fund B, offered to pay a before-tax return of either 100 or 25 percent. 
The probabilities of these returns were stated as 80 and 20 percent, respectively.¹⁰

The instructions specified that if a decision was made to invest any certificates, the 
accumulated value of these certificates (after payment of applicable redemption fees for the 
income tax regime group) would be mailed to the subjects at the end of the three-month period.¹¹
The only opportunity cost directly associated with an investment decision, therefore, arose from 
the three-month delay in redeeming certificates. 
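
The effect of the income tax regime's fee on the two funds can be illustrated with a minimal Python sketch. It assumes the certain-rate treatment, in which a 25 percent fee applies proportionally to investment gains, and it ignores the fee on the initial endowment; because both funds pay positive returns, the loss provisions never apply. The helper `mean_and_sd` is hypothetical.

```python
from fractions import Fraction
import math

# Before-tax returns (percent) and probabilities stated in the instructions.
FUND_A = [(Fraction(50), Fraction(1))]
FUND_B = [(Fraction(100), Fraction(4, 5)), (Fraction(25), Fraction(1, 5))]
INCOME_FEE = Fraction(25, 100)  # certain-rate treatment, income tax regime


def mean_and_sd(outcomes, fee=Fraction(0)):
    """Mean and standard deviation of returns after a proportional fee on gains."""
    after = [(r * (1 - fee), p) for r, p in outcomes]
    mean = sum(r * p for r, p in after)
    variance = sum(p * (r - mean) ** 2 for r, p in after)
    return mean, math.sqrt(variance)


before_b = mean_and_sd(FUND_B)             # no fee on returns
after_b = mean_and_sd(FUND_B, INCOME_FEE)  # 25 percent fee on gains
after_a = mean_and_sd(FUND_A, INCOME_FEE)
```

Under these assumptions, Fund B's expected return falls from 85 to 63.75 percent and its standard deviation from 30 to 22.5 percentage points, while Fund A's certain return falls from 50 to 37.5 percent. This is the mechanism described in Section II: an income tax lowers both the after-tax return on savings and the variability of that return.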


10 Although the expected after-tax returns and relative economic spreads across the savings alternatives differed for the 
three tax treatments and two tax regime groups, the differences resulted from inherent features of the tax regimes rather 
than inequivalencies in the selected rates of return or tax. For the certain and uncertain consumption tax rate treatments, 
the expected after-tax rates of return were identical to the rates of return for the treatments without a tax. These rates
were identical because a consumption tax regime, by design, has no effect on return. In comparison, the expected after-
tax rates of return for the certain and uncertain income tax rate treatments were lower than the rates of return for the 
certain and uncertain consumption tax rate treatments and no-tax treatments because the design of an income tax 
inherently lowers the return on savings. Equalization of the expected after-tax returns would have required the use of
either disparate before-tax returns or a zero-rate income tax. Neither of these two alternatives was viable in the context 
of the study because the first would have confounded the results with an exogenous market rate adjustment while the 
second would have negated the significance of the research questions under investigation.

11 Each subject who invested in the funds was mailed the appropriate number of certificates at the end of the 12th week
following the experiment. Allowing time for mail delivery, the certificates should have been received within one or two 
days of the three-month investment period. No subjects reported not receiving their certificates and only three percent 
of the mailed certificates were not redeemed within the 30-day redemption period. 
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After the subjects finished making their decisions for each of the three tax rate treatments, 
they were asked to roll a die to determine which of their decision sets would be the basis of their 
payoff. Subjects who selected the decision set for the uncertain tax rate treatment were asked to 
roll a second die to determine the appropriate rate of their future redemption fee. Following this 
determination, all subjects were given the number of certificates they wished to redeem in the 
immediate 30-day period and asked to complete a short post-test questionnaire. 

The intent of the post-test questionnaire was to assess the effectiveness of the manipulated 
variables and experimental procedures. Responses to this questionnaire indicated that all of the 
subjects understood the relevance of the redemption fee to their payoffs and the key difference 
between the two investment funds. In addition, the post-test data showed that 83 subjects 
considered their payoffs (and expected future payoffs) to be adequate, 85 understood the factors 
that determined their payoffs, and 88 had no knowledge of either the parameter values of the other 
tax regime group or the payoffs to other subjects. The final value of the subjects' payoffs, after 
inclusion of future investment returns and reduction for assessed redemption fees, was between 
$6.56 and $28.00, with an average of $15.23. For most subjects, the experimental task lasted 
approximately 50 minutes; an additional three to ten minutes was required to redeem the 
certificates at the retail establishments. 


Parameter Values 


In selecting the tax rates and investment rates of return for the experiment, three criteria were
employed. First, the tax rates of the income and consumption tax regime groups were required to
generate analytically equivalent expected revenue and utility.12 Second, the expected before-tax
rates of return on savings for the two tax regime groups were required to be identical.13 Third, the
future tax rates of each regime group were required to follow a dichotomous probability
distribution with an expected value equal to that group's present tax rate.

The first two selection criteria were based on the equivalency conditions used by Ahsan 
(1989) in his numerical analysis of certain future tax rates. In his analysis, Ahsan demonstrated 
that when income and consumption tax regimes provide expected before-tax returns on safe and 
risky assets of 50 and 75 percent, respectively, an income tax rate of 25 percent and a consumption 
tax rate between 38 and 41 percent (depending on the degree of relative risk aversion) are 
equivalent in terms of both expected revenue and utility. For the experiment, a consumption tax 
rate of 40 percent was selected over the alternative rates of 38, 39 or 41 percent for computational 
ease. 

The third criterion was based on the third condition of the uncertainty analysis presented 
earlier. When combined with the first two selection criteria, it ensured that the selected tax rates 
and investment rates of return satisfied the primary conditions underlying the hypothesized 
relationships in an experimentally efficient manner. Neither it nor the other two selection criteria, 


12 Without knowledge of the subjects' utility functions, it was not possible to determine a priori the tax rates at which the
income and consumption tax regime groups would generate equivalent revenue and utility. The selection criteria
consequently were based on analytical rather than experimental equivalency. Ex post calculations based on the subjects'
behavior during the experiment indicated that the two tax regime groups generated approximately the same amount of
revenue. No ex post calculation of the subjects' utility was possible.

13 Because of inherent differences in the design of income and consumption tax regimes, it was not possible for the two
simulated tax regimes to have the same expected before- and after-tax rates of return. To have equalized the expected
after-tax rates of return for the two tax regime groups would have required the use of either dissimilar before-tax rates
of return or a zero-rate income tax. As discussed in footnote 10, however, neither of these two alternatives was viable
in the context of the study.

however, provided any assurance that the level of relative risk aversion exhibited by the subjects
during the experiment would be close to unity. Instead, this condition was assumed to be satisfied
by the pre-screening process.

The three-month time requirement for invested certificates was adopted for two reasons.
First, this time requirement proved acceptable to the participating establishments when soliciting
their cooperation in redeeming certificates. Second, it avoided potential problems with graduating
or transferring subjects whose decisions, under a longer time requirement, might have been
affected by post-experimental factors, such as future career plans or travel.14


Savings and Risk-Taking Measures 


Two dependent measures were calculated for each subject in the three tax rate treatments. The 
first measure, representing proportional savings, was computed as the ratio of the number of 
certificates invested in both Funds A and B to the total number of certificates available for 
investment. For the consumption tax regime group, the denominator of this ratio was 1,400, or 
the total number of certificates given to each subject. For the income tax regime group, however, 
the denominator was 1,400 only when the treatment did not include a simulated tax; in the certain 
and uncertain tax rate treatments the denominator was 1,050. Use of this lower denominator was 
required for these two treatments because the simulated income tax reduced the number of 
available certificates by 25 percent. 

The second measure, representing proportional risk-taking, also was calculated as a ratio. It
differed from the savings measure, however, in that the numerator was the number of certificates
invested in the riskier Fund B, while the denominator was the total number of certificates invested
in both Funds A and B. When no certificates were invested by a subject for a particular tax rate
treatment, a risk-taking measure was not calculated and the observation was treated as missing.
Irrespective of the subjects' savings behavior, the calculation of the risk-taking measure was the
same for all tax regime groups and rate treatments. Both the measures of proportional savings and
risk-taking corresponded with those used in the analytical work of Ahsan (1989, 1990).
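
The two dependent measures described above can be sketched as follows. This is a minimal illustration, not the study's own procedure: the function names are hypothetical, and scaling the ratios to percentages is an assumption made only so that the values are comparable to the cell means reported with figures 1 and 2.

```python
from typing import Optional

TOTAL_CERTIFICATES = 1400   # certificates given to each subject (from the text)
INCOME_TAX_RATE = 0.25      # simulated income tax rate (from the text)


def available_certificates(regime: str, treatment: str) -> int:
    """Denominator of the savings ratio for a regime group and rate treatment."""
    if regime == "income" and treatment in ("certain", "uncertain"):
        # The simulated income tax removes 25 percent of the certificates.
        return round(TOTAL_CERTIFICATES * (1 - INCOME_TAX_RATE))
    return TOTAL_CERTIFICATES


def proportional_savings(fund_a: int, fund_b: int, regime: str, treatment: str) -> float:
    """Certificates invested in Funds A and B as a percentage of those available."""
    return 100.0 * (fund_a + fund_b) / available_certificates(regime, treatment)


def proportional_risk_taking(fund_a: int, fund_b: int) -> Optional[float]:
    """Fund B certificates as a percentage of all invested certificates.

    Returns None (a missing observation) when nothing was invested.
    """
    invested = fund_a + fund_b
    if invested == 0:
        return None
    return 100.0 * fund_b / invested
```

Returning `None` for a subject who invested nothing mirrors the treatment of such observations as missing, which is why the risk-taking analysis has fewer subjects than the savings analysis.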


III. RESULTS 


Tests of the effects of the tax regime groups and rate treatments on proportional savings and
risk-taking were conducted using repeated-measures ANOVAs with univariate and multivariate
procedures. Univariate procedures were employed to test the between-subjects effects of the tax
regime groups, while multivariate procedures were utilized to test the within-subject effects of
the tax rate treatments. Multivariate procedures were employed for the within-subject tests
because the circularity assumption was not tenable for either repeated measure, proportional
savings or risk-taking. Tests of the assumptions required for the between-subjects univariate
approach and the within-subject multivariate approach showed no significant violations.16


14 By using a three-month time requirement, the entire experiment, from pre-test questionnaire to final payoff, was
completed within one semester. A longer time requirement would have spanned more than a single semester and
possibly introduced uncontrollable confounding effects.

15 Tests of within-subject effects using a univariate repeated-measures ANOVA are unbiased only when the variances and
covariances of the repeated measures display a particular patterned relationship known as circularity. Circularity
requires that for all pairs of repeated measures the sum of their variances minus twice their covariance equals a constant.
When the data do not satisfy the circularity assumption, multivariate procedures are appropriate.
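
The circularity condition stated in this footnote can be examined informally as sketched below. The quantity "sum of the variances minus twice the covariance" for a pair of measures is the variance of their difference, so the sketch computes that variance for every pair. The function name and the scores are hypothetical and are not data from the study; a formal test of the assumption would be needed in practice.

```python
from itertools import combinations
from statistics import variance


def difference_variances(scores):
    """Variance of the pairwise differences between repeated measures.

    `scores` maps a treatment label to per-subject values listed in the same
    subject order. Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y), which is the
    quantity that circularity requires to be constant across all pairs.
    """
    result = {}
    for a, b in combinations(scores, 2):
        diffs = [x - y for x, y in zip(scores[a], scores[b])]
        result[(a, b)] = variance(diffs)
    return result


# Hypothetical scores for four subjects under three treatments.
toy_scores = {
    "no_tax": [40, 50, 60, 30],
    "certain": [30, 45, 50, 20],
    "uncertain": [10, 45, 20, 25],
}
pair_variances = difference_variances(toy_scores)
```

Widely differing values across the pairs would suggest that circularity does not hold, in which case the multivariate approach described in the text is the safer choice.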

16 In addition to testing for violations of the assumptions underlying the univariate and multivariate models, the Box-Cox
procedure was employed to determine if a power transformation or angular transformation could better minimize the
error sum of squares. The results of this procedure indicated that the untransformed data produced the minimum error
sum of squares. The results consequently are reported using the untransformed data.

The results of the repeated-measures ANOVAs are reported in table 1. For both the measures
of proportional savings and risk-taking, the main effects of the tax rate treatment and the 
interactions of the rate treatments and regime groups are significant (p = .000). In comparison, 
the main effects of the tax regime groups are not significant (p = .148 and .075); but this finding 
is not interpretable in light of the significant interaction and requires further examination. 
Graphical illustrations of the interactions are presented in figures 1 and 2. 

Figures 1 and 2 reveal that, in the absence of a tax, the income and consumption tax regime 
groups undertook approximately equal proportions of savings and risk-taking. When a certain tax 
rate was considered, however, the income tax regime group decreased and increased its levels of 





TABLE 1
Repeated Measure ANOVA Tests of Differences for the Tax Regime Groups and Tax
Rate Treatments

                                             Sum of       Mean   Wilks'
Source of Variation                 df      Squares     Square   Lambda  F-statistic  p-value
Panel A. Proportional Savings:
  Between-subjects:
    Tax regime group                 1     4,796.80   4,796.80                 2.13      .148
    Error                           88   198,602.68   2,256.85
  Within-subject:
    Tax rate treatment               2     7,042.00   3,521.04     .532       38.29*     .000
    Interaction of tax rate
      treatment with regime group    2     3,340.94   1,670.47     .742       15.11*     .000
    Error                          176    16,659.29      94.66
Panel B. Proportional Risk-taking:**
  Between-subjects:
    Tax regime group                 1     7,788.61   7,788.61                 3.27      .075
    Error                           69   164,172.55   2,379.31
  Within-subject:
    Tax rate treatment               2     3,773.06   1,886.53     .603       22.42*     .000
    Interaction of tax rate
      treatment with regime group    2     2,366.81   1,183.41     .702       14.43*     .000
    Error                          138    10,801.84      78.27


* The F-statistic is based on Wilks' lambda criterion.
** The general linear model is used to adjust for unequal sample sizes.
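
The F-statistics in table 1 can be approximately reproduced from the other columns, as sketched below. The between-subjects F is the ratio of the effect mean square to the error mean square. For the starred statistics, the table does not say how F was derived from Wilks' lambda; the sketch assumes the standard exact conversion for a hypothesis of rank one, with two contrasts among the three rate treatments and the between-subjects error degrees of freedom. The function names are hypothetical.

```python
def f_from_mean_squares(ms_effect: float, ms_error: float) -> float:
    """Univariate F-statistic: effect mean square over error mean square."""
    return ms_effect / ms_error


def f_from_wilks_lambda(wilks_lambda: float, error_df: int, n_contrasts: int):
    """Exact F for a rank-one hypothesis tested with Wilks' lambda (assumed form).

    Returns (F, numerator df, denominator df).
    """
    denom_df = error_df - n_contrasts + 1
    f_stat = (1.0 - wilks_lambda) / wilks_lambda * denom_df / n_contrasts
    return f_stat, n_contrasts, denom_df


# Panel A (proportional savings), values copied from table 1.
f_between = f_from_mean_squares(4796.80, 2256.85)
f_treatment, df1, df2 = f_from_wilks_lambda(0.532, error_df=88, n_contrasts=2)
f_interaction, _, _ = f_from_wilks_lambda(0.742, error_df=88, n_contrasts=2)
print(round(f_between, 2), round(f_treatment, 2), round(f_interaction, 2), (df1, df2))
```

Under these assumptions the converted values agree with the tabulated F-statistics to within the rounding of lambda to three decimals; the same holds for panel B with 69 error degrees of freedom.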
FIGURE 1
Tax Regime Group by Tax Rate Treatment Interaction Effect on Proportional Savings

[Figure not reproduced. Vertical axis: mean savings.]

Cell Means
                                                  Standard
                                    n     Mean   Deviation
Income Tax Regime Group:
  No-tax                           45    42.49     32.83
  Certain tax rate                 45    28.47     25.84
  Uncertain tax rate               45    23.60     22.76
Consumption Tax Regime Group:
  No-tax                           45    41.19     29.30
  Certain tax rate                 45    43.57     31.95
  Uncertain tax rate               45    35.08     27.37

proportional savings and risk-taking, respectively; the consumption tax regime group responded 
with little change. When asked to consider the uncertain tax rate, both groups responded by 
reducing their savings and risk-taking proportions. However, the magnitude of their responses 
differed. For the measure of proportional savings, the decline was almost twice as great for the 
consumption tax regime group as for the income tax regime group. For the measure of 
proportional risk-taking, the decline was approximately equal for both groups. 

Statistical analyses of the simple effects associated with the interactions were performed 
using Tukey’s HSD multiple comparison test. This test allowed the main effects of the tax regime 
groups and rate treatments to be examined separately. However, because the test, in its general 
FIGURE 2
Tax Regime Group by Tax Rate Treatment Interaction Effect on Proportional Risk-Taking

[Figure not reproduced. Vertical axis: mean proportional risk-taking, 0% to 100%. Horizontal axis: no-tax, certain tax rate, uncertain tax rate.]

Cell Means
                                                  Standard
                                    n     Mean   Deviation
Income Tax Regime Group:
  No-tax                           34    59.02     33.27
  Certain tax rate                 34    74.46     29.01
  Uncertain tax rate               34    64.95     30.15
Consumption Tax Regime Group:
  No-tax                           37    56.35     28.18
  Certain tax rate                 37    57.57     27.63
  Uncertain tax rate               37    48.20     26.15

form, is not robust to violations of the circularity assumption, the residual mean squares of the
specific contrasts of interest were used instead of the pooled residual mean squares (Boik 1981;
Kirk 1982). In addition, because the sample sizes of the proportional risk-taking measure for the
two tax regime groups differed, a modified version of the test proposed by Spjøtvoll and Stoline
(1973) was used to test the between-subjects effect on this measure.

Table 2 presents the results of the multiple comparisons between the two tax regime groups.
As expected, no significant differences were detected between the proportional savings and
risk-taking means of the income and consumption tax regime groups when they were not subject to
a tax (p > .15). However, when the two groups were subject to a certain tax rate, their proportional
savings and risk-taking behavior differed, with the income tax regime group saving proportionally
less and risking proportionally more than the consumption tax regime group (p < .05). These
findings support hypotheses H1a and H1b and lend credibility to Ahsan's (1989, 1990) analytical
conclusions.
TABLE 2
Tukey's HSD Multiple Comparison Test of Simple Between-Subjects Effects for the
Tax Rate Treatments

Tukey's HSD test: Critical value = $q_{\alpha_e}\sqrt{MS_{error}/n}$

Comparisons of                           Mean Square   Computed
Tax Regime Means            df      n       Error        Value     p-value
Panel A. Proportional Savings:
  No-tax                    88     45      968.27         1.30      >.15
  Certain tax rate          88     45      844.20        15.10      <.05
  Uncertain tax rate        88     45      633.69        11.48      <.10
Panel B. Proportional Risk-taking:
  No-tax                    69     34*     943.65         2.67      >.15
  Certain tax rate          69     34*     800.67        16.89      <.05
  Uncertain tax rate        69     34*     791.54        16.75      <.05


$\alpha_e$ denotes the experiment-wise error rate and is the basis for the reported p-values.

* Adjusted for unequal sample sizes using the Spjøtvoll-Stoline T' test, a generalization of Tukey's HSD test. This test
replaces n in the HSD denominator with the minimum of the compared sample sizes.
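
The "computed value" column of table 2 equals the absolute difference between the two regime groups' cell means, which the sketch below reproduces for panel A from the means reported with figure 1. The critical value is assumed to take the usual Tukey form, a studentized range quantile multiplied by the square root of the mean square error over n. The function names are hypothetical, and the quantile `q` passed in the usage line is a placeholder to be looked up in a studentized range table; it is not a value taken from the study.

```python
from math import sqrt

# Proportional savings cell means reported with figure 1.
SAVINGS_MEANS = {
    "income": {"no-tax": 42.49, "certain": 28.47, "uncertain": 23.60},
    "consumption": {"no-tax": 41.19, "certain": 43.57, "uncertain": 35.08},
}


def between_group_difference(means, treatment):
    """Absolute difference between the regime groups' means for one treatment."""
    return abs(means["income"][treatment] - means["consumption"][treatment])


def hsd_critical_value(q, ms_error, n):
    """Tukey HSD critical difference (assumed form): q * sqrt(MS_error / n)."""
    return q * sqrt(ms_error / n)


differences = {t: round(between_group_difference(SAVINGS_MEANS, t), 2)
               for t in ("no-tax", "certain", "uncertain")}

# Placeholder quantile; mean square error and n for the certain tax rate row.
critical = hsd_critical_value(q=3.0, ms_error=844.20, n=45)
```

A comparison is significant at the chosen experiment-wise error rate when the difference in means exceeds the critical value obtained with the matching quantile.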





With respect to the effects of the uncertain tax rate treatment, similar differences between the
two groups' proportional savings and risk-taking behavior were detected. The difference between
the two groups' mean proportional savings, however, was significant only at the .10 level. Three
explanations for this lack of strong support for hypothesis H2a are possible. First, the
income tax regime group may not have viewed the uncertainty of the future tax rate as prominently
as the consumption tax regime group because it affected only their earnings from invested
amounts and these amounts were already low from the simulated tax. Second, the income tax
regime group may have exhibited a floor effect whereby their level of proportional savings could
not be reduced by the same amount as that of the consumption tax regime group. Third, the
uncertainty in the manipulation of the future tax rate may not have been perceived as equivalent
by the two groups, despite the analytical equivalence.17

The proportional risk-taking means of the income and consumption tax regime groups for the 
uncertain tax rate treatment were significantly different at conventional levels (p < .05). This
finding supports hypothesis H2b and corroborates the effect observed in figure 2, where the 
differences between the means of the tax regime groups' proportional risk-taking measures 


17 The analytical equivalence of the uncertainty in the two tax regime groups' future tax rate was based on Arrow's (1974,
111) analytical work on risk-taking, which suggests a value for relative risk aversion close to unity. When relative risk
aversion is substantially greater than one (i.e., four or more), this analytical equivalence can be maintained only if the
consumption tax regime group saves proportionally less than the income tax regime group. The lack of support
for hypothesis H2a consequently may be due to high levels of relative risk aversion among the subjects. However,
neither the pre-test questionnaire responses regarding risk nor the risk-taking behavior exhibited by the subjects during
the no-tax treatment provide evidence of high levels of relative risk aversion. Any inequivalence in the savings behavior,
therefore, would appear more likely to be attributable to perceptual or cognitive differences regarding the future tax rate
manipulation.

appeared to be similar for both the certain and uncertain tax rate treatments. It also implies that
the uncertain future tax rate manipulation was perceptually equivalent in terms of its effect on
risk-taking behavior.18

The results of the multiple comparisons among the three tax rate treatments for the two regime
groups are presented in table 3. As shown in panel A, the mean proportional savings measures of
the income tax regime group were significantly less for the certain and uncertain tax rate
treatments than for the no-tax treatment (p < .01). The difference between the group's proportional
savings means for the certain and uncertain tax rates, however, was not significant (p > .15).
This latter finding does not support the posited effect of hypothesis H3a and may be attributable
to the same factors as discussed earlier for hypothesis H2a.

The proportional savings behavior of the consumption tax regime group was consistent with 
both the analytical conclusions of Ahsan (1989, 1990) and hypothesis H4a. Specifically, the 
group’s proportional savings means for the no-tax and certain tax rate treatments were found not 
to differ significantly (p > .15). This lack of a significant difference agrees with Ahsan’s (1989, 
1990) analytical work. It does not, however, provide interpretable evidence as to the savings 
neutrality of a consumption tax regime since many other factors also could account for such a 
result. More interpretable, therefore, are the findings of significantly greater mean proportional 
savings measures for the no-tax and certain tax rate treatments than for the uncertain tax rate 
treatment (p < .05). In particular, the significance of the comparison between the certain and 
uncertain tax rate treatments supports hypothesis H4a and indicates that future tax rate uncertainty 
can impair a consumption tax regime’s neutrality toward savings. 

With respect to proportional risk-taking, panel B of table 3 shows that the proportion of risk
undertaken by the income tax regime group was significantly greater for the certain tax rate
treatment than for the no-tax and uncertain tax rate treatments (p < .05). The difference in the mean
proportional risk-taking measures of the group for the no-tax and uncertain tax rate treatments,
however, was not significant (p > .15). The finding of significantly greater proportional
risk-taking behavior for the certain tax rate treatment than for the uncertain tax rate treatment supports
hypothesis H3b. Together with the finding of no significant difference between the uncertain and
no-tax treatments, it also suggests that perceived uncertainty about future tax rates can counteract
and reduce much of the risk-seeking effect of an income tax regime.

The proportional risk-taking behavior of the consumption tax regime group differed from that
of the income tax regime group in two respects. First, the group did not exhibit a significantly
different risk-taking response for the no-tax and certain tax rate treatments (p > .15), supporting
Ahsan's (1989, 1990) analytical conclusion regarding the risk-taking neutrality of a consumption
tax regime. Second, the group took significantly less risk with its savings during the uncertain
tax rate treatment than during the no-tax treatment (p < .05). But like the income tax regime group,
the consumption tax regime group invested significantly more of its savings in the risky asset
during the certain tax rate treatment than during the uncertain tax rate treatment (p < .05). The
latter finding supports hypothesis H4b. It and the earlier finding of support for hypothesis H3b
jointly imply that uncertainty in the future tax rate can create an aversion to risk, irrespective of
a tax regime's structure.

The preceding results have several implications for the broader savings and risk-taking 
relationships proposed earlier. The findings regarding proportional savings between the two tax 


18 The apparent perceptual equivalence in the risk-taking behavior of the two tax regime groups is based on the proportional 
risk-taking measure. This measure does not capture the total amount of risk undertaken by the groups or the difference 
in the groups’ return on and variability of savings. 
TABLE 3
Tukey's HSD Multiple Comparison Test of Simple Within-Subject Effects for the Tax
Regime Groups

Tukey's HSD test: Critical value = $q_{\alpha_e}\sqrt{MS_{error}/n}$

                                                     Mean Square   Computed
Comparisons of Tax Rate Means             df     n      Error        Value    p-value
Panel A. Proportional Savings:
  Income tax regime group:
    No-tax vs. certain tax rate           44    45     270.36        14.02     <.01
    No-tax vs. uncertain tax rate         44    45     276.06        18.89     <.01
    Certain tax vs. uncertain tax rate    44    45     140.69         4.87     >.15
  Consumption tax regime group:
    No-tax vs. certain tax rate           44    45     160.05         2.38     >.15
    No-tax vs. uncertain tax rate         44    45     101.26         6.11     <.05
    Certain tax vs. uncertain tax rate    44    45     187.44         8.49     <.05
Panel B. Proportional Risk-taking:
  Income tax regime group:
    No-tax vs. certain tax rate           33    34     338.45        15.44     <.05
    No-tax vs. uncertain tax rate         33    34     152.06         5.93     >.15
    Certain tax vs. uncertain tax rate    33    34     151.11         9.51     <.05
  Consumption tax regime group:
    No-tax vs. certain tax rate           36    37      59.98         1.22     >.15
    No-tax vs. uncertain tax rate         36    37     115.09         8.15     <.05
    Certain tax vs. uncertain tax rate    36    37     136.94         9.37     <.05


$\alpha_e$ denotes the experiment-wise error rate and is the basis for the reported p-values.


regime groups and among the three tax rate treatments suggest an overall savings effect of 
S, ~ $, 2 S, Sp The findings regarding proportional risk-taking suggest an overall effect of 
R, > R, = Re > Ra and R, > R, > Ra Unsupported by the empirical tests is the proposed 
relationship of S S, > S, However, as mentioned earlier, three plausible explanations exist for
this lack of support. 


IV. CONCLUSION 


The experimental findings of this study suggest that institutional risk-sharing arrangements, 
such as the structure of a tax regime, can have significant and multidimensional effects on 
behavior. Specifically, this study found that an income tax regime discourages proportional 
savings and encourages proportional risk-taking when the tax rate is certain. In comparison, a 
consumption tax regime with a certain tax rate was found to be neutral with respect to proportional
savings and risk-taking. 

When the future tax rates of the two regimes are uncertain, the study's findings show that both
proportional savings and risk-taking decline to levels below those which occur with certain future
tax rates. The magnitudes of the declines, however, are dissimilar for the two regimes. For an
income tax regime, the decline in proportional savings is not significantly different from that
which occurs with a certain tax rate; the decline in proportional risk-taking, however, is
significant. For a consumption tax regime, the declines in both proportional savings and
risk-taking are significantly different from their levels with a certain future tax rate. These findings
jointly suggest that although institutional risk-sharing arrangements can simultaneously create
incentive and disincentive effects, the introduction of uncertainty about the future operation of
such arrangements can counteract or exacerbate such effects.

Because the study was conducted using a laboratory experiment, several limitations restrict 
the generalizability of the findings (for a discussion of the advantages and disadvantages of 
experimental tax research, see Bonner et al. 1991). In particular, the use of exogenous rates of 
return and tax rather than rates determined in a market or political environment limits the 
applicability of the findings. The selection of subjects based on savings and risk preferences and
the suppression of noneconomic factors such as societal norms, demographic-related preferences
and infinite consumption-savings options also restrict the extent to which inferences can be
drawn. Most important, however, the contextual nature of the study and its emphasis on
tax-related effects limit generalizations to other, non-tax risk-sharing arrangements. Future research
might assess the effects of other types of risk-sharing arrangements and investigate noneconomic 
factors and endogenous tax and savings rates. 
ABSTRACT: This study examines the equity price reaction to two generally income-increasing
standards on accounting for income taxes, namely Statements of
Financial Accounting Standards (SFAS) No. 96 and No. 109. It is hypothesized that
significant positive abnormal returns should be observed around the Exposure Draft
dates. It is also hypothesized that the equity price reaction to these standards should
be related to the magnitude of their income effects and the economic consequences
of a given income effect. The results are consistent with contracting and political cost
hypotheses.

I. INTRODUCTION


This study measures the equity price reaction to events leading to changes in accounting for
income taxes. Utilizing the multivariate regression model proposed by Schipper and
Thompson (1983), we show that firms exhibit significant positive abnormal returns (on
average) around the issuance of the Exposure Drafts leading to SFAS No. 96 and No. 109. The
portfolio weighting procedure advocated by Sefcik and Thompson (1986) is used to show that the
equity price reaction to these standards is related to the magnitude of their income effects and the
economic consequences of a given income effect.

Prior research has established that bad news (a mandated accounting change that decreases
income and/or retained earnings and increases liability) is associated with negative abnormal
returns for the affected firms. This paper shows that contracting and political cost hypotheses
apply to good news as well, i.e., a mandated accounting change that increases income and/or
retained earnings and decreases liability is associated with positive abnormal returns.1 The study
also highlights the offsetting effects on stock prices of various accounting rules. It shows that the
income tax rule, by easing the impact on earnings of the postretirement benefits rule, results in
relatively higher positive abnormal returns for firms offering such benefits.2

An important feature of this study is that the standards on accounting for income taxes do not
unambiguously increase or decrease the acceptable set of accounting procedures; rather, they
replace one acceptable method with another. Watts and Zimmerman (1986) argue that contracting
costs are increased by standards that reduce the acceptable set of accounting procedures and 
reduced by standards that increase the acceptable set of accounting procedures. However, 
previous studies cannot distinguish the income effect argument from the acceptable set argument. 
For example, SFAS No. 106 causes both income to decrease and the acceptable set of procedures 
to be reduced. Thus, it is not clear to what extent each of these factors contributes to the negative 
stock price response documented by Espahbodi et al. (1991). In this paper, the positive stock price 
response is easily attributed to the income increasing nature of the standards. The results of this 
paper suggest that income effects alone have important economic consequences. 

The remainder of the paper is organized as follows. The hypotheses are developed in section 
II. Section III discusses the sample and data, events, and methodology. Results are reported in 
section IV and concluding remarks are made in section V. 


II. HYPOTHESES 


Since the issuance of Accounting Principles Board (APB) Opinion No. 11 in 1967, 
accounting for income taxes has been the subject of criticisms and concerns (mainly that failure 
to adjust deferred tax credits [liability] for changes in tax rates and laws rendered this liability on 
the balance sheet meaningless). Responding to these criticisms, the FASB added a project to its 
agenda in January of 1982 to reconsider accounting for income taxes that led to the issuance of 
SFAS No. 96 in December of 1987. The Board, however, delayed the adoption date of SFAS No. 
96 three times, owing to concerns about extensive and complex scheduling of temporary 
differences required by that statement, and the limitations on recognizing deferred tax assets. In 
February 1992, the Board issued SFAS No. 109, a) allowing companies to recognize deferred tax
assets when it is "more likely than not" that such assets will be realized and b) reducing the need
for extensive scheduling of temporary differences.


1 Ziebart and Kim (1987) examined the overall market reaction to an income increasing standard (SFAS No. 52), but they
associated the market response only with the accounting method used prior to SFAS No. 8. Ziebart and Kim performed
no test of contracting cost hypotheses and their methodology was different.

2 We also test for the average and individual security price effects of the events using Schipper and Thompson's (1983)
methodology, and for the differential impact of the accounting change for income taxes on various industries. Future
mandated accounting change studies may thus take into consideration the impact of industry factors on abnormal returns
in their design or sample selection. Test results are reported in footnotes 13 and 14.

Both SFAS No. 96 and No. 109 require an asset and liability approach to accounting for
income taxes, and dictate that the deferred tax liability be adjusted for changes in tax rates and
laws (a departure from APB Opinion No. 11). Companies are allowed to report the cumulative
effect of the change in the method of accounting for income taxes either on the income statement,
or as an adjustment to beginning retained earnings restating prior years' financial statements
reported on a comparative basis. A change from APB Opinion No. 11 to either SFAS No. 96 or
No. 109, therefore, would cause the majority of companies to report a large one-time increase in
earnings and/or retained earnings and a decrease in liability because of the decline in tax rates from
a maximum of 46 percent to 34 percent.3 The Wall Street Journal (January 21, 1988) reported
increases in earnings from 18 to over 50 percent for some companies.4
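
The direction of the one-time effect described above can be illustrated with a simplified calculation. The sketch assumes a deferred tax liability that was accrued entirely at the old 46 percent maximum rate and is remeasured at 34 percent; the function name and the liability of 100 are hypothetical, and other provisions of the two standards are ignored.

```python
def remeasure_deferred_tax_liability(liability_at_old_rate, old_rate, new_rate):
    """Restate a deferred tax liability for a change in the enacted tax rate.

    Returns (liability at the new rate, cumulative effect of the change).
    A positive cumulative effect is a reduction in the liability, reported as
    an increase in earnings or in beginning retained earnings.
    """
    # Temporary differences implied by the liability accrued at the old rate.
    temporary_difference = liability_at_old_rate / old_rate
    liability_at_new_rate = temporary_difference * new_rate
    cumulative_effect = liability_at_old_rate - liability_at_new_rate
    return liability_at_new_rate, cumulative_effect


# Hypothetical liability of 100 accrued at 46 percent, remeasured at 34 percent.
new_liability, effect = remeasure_deferred_tax_liability(100.0, 0.46, 0.34)
```

Under these assumptions roughly a quarter of the recorded liability is released, which is why the standards are described as generally income increasing for firms carrying net deferred tax liabilities.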

These pronouncements reduce the probability that existing debt covenants will be violated 
for firms experiencing an increase in earnings and a decrease in liability. For these firms, the 
expected cost of a technical default should decrease, resulting in a wealth transfer via existing debt 
contracts to owners and an increase in their stock prices (Watts and Zimmerman 1986, 286).° 
Stock prices of these firms should also increase if the standards decrease future debt contracting 
costs. Alternatively, announcements related to accounting for income taxes could increase 
political costs, especially for larger firms, resulting in stock price decreases. Stock prices of firms 
with income-based compensation plans could also decrease slightly, as the standards on income 
taxes expand managers' ability to increase their compensation (Watts and Zimmerman 1986, 
286-287).6 


The net impact on stock prices of these pronouncements, therefore, can only be assessed 
empirically. However, prior research shows that, on average, the impact of debt contracting costs 


* See, e.g., Wall Street Journal (July 31, 1986, 14; Sep. 3, 1986, 38; Nov. 4, 1987, 4; Dec. 21, 1987, 6; May 8, 1992, 5J) 

and New York Times (May 9, 1991, 5; Feb. 12, 1992, 22). 

3 As asserted by McConnell et al. (1992, 13), we realize that not all companies will be favorably affected by the new 
standards. Specifically, it is possible that the new standards (SFAS No. 96 or No. 109) decrease equity and reduce 
deferred tax assets accrued before 1988 at rates in excess of 34%, as such assets have to be adjusted for the decrease 
in tax rates. This kind of adjustment is rather uncommon; deferred tax asset recognition under APB No. 11 is subject 
to a “recoverability” test when taxes payable exceed tax expense, and subject to "assurance beyond reasonable doubt” 
in the case of net operating loss carryforwards. Even if deferred assets were commonly recognized, however, empirical 
testing would be difficult because deferred tax assets (charges) are not separately reported on Compustat (they are 
included in other assets). 

It is also possible that the new standards decrease equity and increase deferred tax liability, as both SFAS No. 96 
and No. 109 require recording a deferred tax liability on the difference between the tax and book bases of assets acquired 
in a "purchase" business combination. We allow for this possibility in our hypothesis tests. 

The premise of this paper, however, is that "the majority of firms" will be favorably affected by the new standards. 

Thus, we first hypothesize that "on average" there would be an increase in stock prices. The increase is not expected 
to be uniform across firms. Subsequent hypotheses therefore explore the relation between abnormal returns and firm- 
specific characteristics. 
4 Sufficient information is not available to quantify the impact on financial statements of SFAS No. 96 or No. 109 for all 
companies. In a Bear Stearns Investment Research paper, McConnell et al. (1992) discuss the expected or actual impact 
on 117 companies of adopting SFAS No. 109. Of the 117 firms, nine were (to be) unfavorably affected, 49 not materially 
affected, and 29 positively affected; the remaining 30 firms made no disclosure regarding the magnitude of the impact 
of SFAS No. 109 on their financial statements. As McConnell et al. (1992, 12) acknowledge, “However, predicting the 
magnitude of the accounting change is generally [emphasis added] impossible." In addition, the McConnell et al. 
sample is not representative of all firms as it reflects firms that chose to adopt SFAS No. 109 early or disclose its impact; 
their sample also includes utility firms and financial institutions that are excluded from this study. Finally, it is the 
expected (not actual) effect on income that would drive stock prices around the initial announcements. 

5 The increase in earnings would have no immediate cash flow effects, as taxes payable or refundable would not be altered. 
Theory suggests, however, that accounting regulations do affect stock prices even if they don't change future cash flows 
(Watts and Zimmerman 1986). 

6 The impact on security prices of compensation plans is likely small in the context of accounting for income taxes, as 
compensation in many plans is based on pretax income. 
outweighs the effects of political costs and the compensation contracts (see, e.g., Espahbodi et al. 
1991; Espahbodi and Tehranian 1989; Lys 1984; and Collins et al. 1981). As a result, it is expected 
that, on average, firms exhibit an increase in their stock prices in reaction to these generally 
income-increasing announcements. Thus, our first hypothesis is: 


H1: Stock prices increase on average following announcements that increase the probability 
that the provisions of SFAS No. 96 or No. 109 will be enacted. 


The impact of the proposed standards on accounting for income taxes, of course, is expected 
to vary across firms depending on the magnitude of the income effect of these standards. Firms 
with larger deferred tax liability relative to their assets should experience a larger increase in their 
earnings. A further increase in earnings could be experienced by companies offering postretirement 
health benefits, as SFAS No. 109 allows easier recognition of deferred tax assets for accruing 
such costs under SFAS No. 106. According to the Wall Street Journal (March 1, and May 9, 1991), 
and New York Times (May 9, 1991), the postretirement benefits rule was expected to reduce total 
corporate profits by as much as $200 billion to $1 trillion starting in 1993. Thus, SFAS No. 109 
could ease the impact on earnings of this postretirement benefits rule by as much as 34 percent 
or $68 to $340 billion. Finally, reported earnings could increase for companies with net operating 
loss (and tax credit) carryforwards because, under SFAS No. 109, a deferred tax asset is 
recognized (reducing the income tax expense and increasing earnings) if it is “more likely than 
not" that such carryforward benefits will be realized. On the other hand, the increase in earnings 
of some companies could be offset by the requirement of SFAS No. 96 and No. 109 to record a 
deferred tax liability on the difference between the tax and book bases of assets acquired in a 
purchase business combination. Thus, the increase in stock prices is expected to be more 
pronounced for firms with a larger deferred tax liability balance relative to their total assets, firms 
offering postretirement benefits, firms with higher net operating loss carryforwards relative to 
their total assets, and firms not using the purchase method of acquisition (or generally those with 
few or no acquisitions, as the pooling of interests method is fairly restricted). Our next four 
hypotheses, therefore, are: 


H2: The stock price reaction to announcements related to accounting for income taxes is 
positively related to deferred tax liability over total assets. 


H3: The stock price reaction to announcements related to accounting for income taxes is 
positively related to the existence of postretirement benefits. 


H4: The stock price reaction to announcements related to accounting for income taxes is 
positively related to net operating loss carryforwards over total assets. 


H5: The stock price reaction to announcements related to accounting for income taxes is 
negatively related to acquisitions (divided by total assets) in purchase business combi- 
nations. 


The equity price reaction to SFAS No. 96 and No. 109 should also be related to the economic 
consequences of a given income effect. The proposed standards decrease debt contracting costs 
by decreasing liabilities and increasing owners' equity. The decrease in debt ratios benefits those 
firms close to debt covenant constraints. Similar to Espahbodi et al. (1991), a debt ratio is used 
to proxy for the existence and tightness of debt covenant restrictions. Using this proxy, the debt 
hypothesis suggests that, by avoiding potentially greater contracting costs associated with debt 
covenants, the more highly levered firms will experience larger stock price increases. The 
positive association between stock price reaction to the events of interest and debt ratios can also 
represent a mechanical relation. Specifically, if the overall impact of the standards is to increase 
firm value then, other things equal, the more highly levered firms will have a larger percentage 
increase in their equity value (Espahbodi et al. 1991). Either of these explanations leads to our 
sixth hypothesis: 


H6: The stock price reaction to announcements related to accounting for income taxes is 
positively related to the firm's debt ratio. 
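
The mechanical leverage relation behind H6 can be illustrated with a small worked example. This is only a sketch: the firm values, debt levels, and the function name `equity_pct_change` are hypothetical, and it assumes that the entire gain in firm value accrues to equity holders while the value of debt is unchanged.

```python
def equity_pct_change(firm_value, debt, value_gain):
    """Percentage change in equity value when firm value rises by
    value_gain and the value of debt is held fixed (an assumption)."""
    equity = firm_value - debt
    if equity <= 0:
        raise ValueError("equity must be positive")
    return 100.0 * value_gain / equity


# Two hypothetical firms with the same total value and the same gain,
# differing only in leverage (debt ratios of 0.20 and 0.80).
low_leverage = equity_pct_change(firm_value=100.0, debt=20.0, value_gain=1.0)
high_leverage = equity_pct_change(firm_value=100.0, debt=80.0, value_gain=1.0)

print(f"low leverage:  {low_leverage:.2f}%")
print(f"high leverage: {high_leverage:.2f}%")
```

Under these assumptions the same one-unit gain in firm value is a larger percentage of the smaller equity base, so the more highly levered firm shows the larger equity return even without any change in contracting costs.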


An accounting rule that increases income may increase the political costs associated with 
regulatory pressures and thus decrease stock prices (Watts and Zimmerman 1978). The political 
cost hypothesis predicts that larger firms will have a more negative stock price effect (Watts and 
Zimmerman 1986, 287 and 295). Using the market value of common equity as a proxy for firm 
size, our last hypothesis is: 


H7: The stock price reaction to announcements related to accounting for income taxes is 
negatively related to the market value of the firm. 


III. METHODOLOGY 
Sample and Data 


A sample of 500 firms was selected randomly from all those that met the following criteria: 
(1) listing on the 1991 Annual Report file of the NAARS data base which contains the financial 
statements and footnotes of about 4200 firms, including most of the NYSE, AMEX, and Fortune 
1000 companies as well as some OTC companies; and (2) not utility, finance, insurance, or real 
estate companies. 

The first criterion allows identification of companies with postretirement benefits. The 
second criterion excludes utility, finance, insurance, and real estate companies, as these companies 
are affected differently by announcements regarding income tax accounting and there is no 
control group (i.e., no firms unaffected by income tax accounting) to control for these differences. 
Lack of return and financial data reduced the final sample to 420 firms, which are used for testing 
all seven hypotheses.7 

To test H2 through H7, additional data are obtained to measure the following six firm 
characteristics. The POSTRET variable takes on a value of zero if the firm offers no postretire- 
ment benefits, and a value of one if the existence of postretirement benefits is reported in the 1991 
Annual Report file of the NAARS data base. CARRYFWD is defined as tax loss carryforward 
benefits over total assets. DEFTAX is the deferred tax liability on the balance sheet over total 
assets. ACQU is acquisitions (cash outflow of funds used for and/or the costs relating to 
acquisition of companies) over total assets if the purchase method of acquisition is used, and zero 
otherwise. This variable is a proxy for the amount of deferred tax liability, as a fraction of total 
assets, to be recorded on the difference between the tax and book bases of assets acquired in a 
purchase business combination. If acquisitions qualify for pooling of interest, no deferred tax 
liability is recorded and ACQU takes a value of zero. Debt ratio (DEBTA) is measured as the year- 
end ratio of book value of total liabilities over the book value of total assets. Market value 


7 We included firms with few missing daily returns in the analysis but excluded them from calculation of portfolio returns 
for the respective missing days. 
(MKTVAL), measured in millions of dollars, is the product of the year-end stock price and 
common shares outstanding. The required data for these latter five variables are retrieved from 
Compustat. All these variables are measured as averages over the period covering the events of 
interest (years 1986-1991), except for DEFTAX. DEFTAX is measured as of 1986, because the 
income tax rate was lowered by the 1986 Tax Act from a maximum of 46 percent prior to and 
including 1986 to 40 percent in 1987 and to 34 percent in 1988; thus, even under APB Opinion 
No. 11, any additions to the deferred tax liability after 1987 would have reflected the 34 percent 
maximum tax rate and adoption of SFAS No. 96 or No. 109 would not have increased earnings 
further.8 
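
The variable definitions above can be summarized in a short sketch. The field names (`total_assets`, `nol_carryforward`, `purchase_method`, and so on) and the function `firm_characteristics` are hypothetical stand-ins for the underlying Compustat and NAARS items, not the actual data item names; the sketch only mirrors the stated rules (period averages for CARRYFWD, ACQU, DEBTA, and MKTVAL; DEFTAX as of 1986; POSTRET as a zero-one indicator).

```python
from statistics import mean


def firm_characteristics(years, deftax_liability_1986, total_assets_1986,
                         has_postret_1991):
    """Build the six firm characteristics from annual records.

    years: list of dicts, one per fiscal year in the event period
           (field names are hypothetical).
    """
    return {
        # Measured as of 1986 only.
        "DEFTAX": deftax_liability_1986 / total_assets_1986,
        # 1 if postretirement benefits are reported, 0 otherwise.
        "POSTRET": 1 if has_postret_1991 else 0,
        "CARRYFWD": mean(y["nol_carryforward"] / y["total_assets"] for y in years),
        # Zero in years where the purchase method is not used.
        "ACQU": mean(
            y["acquisitions"] / y["total_assets"] if y["purchase_method"] else 0.0
            for y in years
        ),
        "DEBTA": mean(y["total_liabilities"] / y["total_assets"] for y in years),
        # Year-end price times shares outstanding (shares in millions).
        "MKTVAL": mean(y["price"] * y["shares_outstanding"] for y in years),
    }


# Two hypothetical years of data for one firm.
example_years = [
    {"total_assets": 100.0, "total_liabilities": 50.0, "nol_carryforward": 10.0,
     "acquisitions": 5.0, "purchase_method": True, "price": 10.0,
     "shares_outstanding": 2.0},
    {"total_assets": 200.0, "total_liabilities": 120.0, "nol_carryforward": 0.0,
     "acquisitions": 20.0, "purchase_method": False, "price": 20.0,
     "shares_outstanding": 2.0},
]
example = firm_characteristics(example_years, 3.0, 100.0, True)
print(example)
```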

Descriptive statistics on these six variables are presented in panel A of table 1. The first four 
variables (DEFTAX, CARRYFWD, ACQU, and POSTRET) capture the magnitude of the 
income effect of SFAS No. 96 and 109, and the other two variables (DEBTA and MKTVAL) 
measure the economic consequences of a given income effect. Pearson correlation coefficients 
between these firm characteristics are presented in panel B of table 1. The null hypotheses of no 
correlation between the variable POSTRET and others (except ACQU) are rejected at the .01 or 
.05 significance level, suggesting that the existence of postretirement benefits is related to other 
firm characteristics (i.e., that firms with postretirement benefit packages are larger firms, with 
higher debt and deferred tax over total assets but lower loss carryforward benefits over total 
assets). Also, deferred taxes over total assets is positively related to market value (size), and 
negatively related to loss carryforward benefits over total assets. The highest correlation is a 
positive one between acquisitions and carryforward benefits, suggesting that active acquirers 
have higher tax loss carryforward benefits (frequently acquired through business combinations). 
To account for such interrelations, we use the methodology proposed by Sefcik and Thompson 
(1986). 


Events Examined 


The events (pronouncements) related to accounting for income taxes are compiled from the 
Wall Street Journal Index, the Wall Street Journal, the New York Times, the ABI/Inform data base, 
and the records of FASB. Multiple news sources are used to capture all the pronouncements 
related to accounting for income taxes (Thompson et al. 1987). Events prior to 1986 are ignored 
as the major impact of SFAS No. 96 on financial statements is due to reduction in tax rates as a 
result of the 1986 Tax Act. 

Twenty-seven events are identified that may be potentially significant to investors in 
informing them of the impact on companies of the proposed rules and/or their likelihood of being 
passed. This study, however, only considers three events related to accounting for income taxes: 
the two Exposure Drafts leading to SFAS No. 96 and No. 109, and the formal vote (5 to 2) by the FASB 
to revise SFAS No. 96 to soften its impact on financial statements.9 Selection of Exposure Drafts 
as two of the significant events is predicated on prior studies finding a significant stock price 
change only around such announcements (see, for example, Leftwich 1981; Lys 1984; Salatka 
1989; and Espahbodi et al. 1991). Both events suggest that, for the majority of U.S. firms, profits 


8 One issue concerning variable measurements is differences in the measurement periods. POSTRET and DEFTAX are 
measured in only one year, while the other firm characteristics (CARRYFWD, ACQU, DEBTA, and MKTVAL) are 
measured over a six-year period. However, when we tested H2 through H7 with all six firm characteristics measured 
in 1991, results were very similar in sign and significance to those reported in table 3, showing very little sensitivity to 
the measurement period. 

9 Theoretically, other announcements before and after the Exposure Drafts may provide information on the proposed 
rules' impact and likelihood of acceptance. However, when all 27 events are considered, the results are essentially 
unchanged for the three events. Also, portfolio abnormal returns for the other 24 events are not significantly different 
from zero at the .05 level (see also footnotes 13 and 14). 


TABLE 1 
Sample Firm Characteristics Over the Years 1986-1991 

Panel A: Descriptive Statistics on Firm Characteristics 

Variable*     Mean       Median     Standard Deviation 
DEFTAX        0.027      0.012      0.037 
POSTRET       0.260      0.000      0.439 
CARRYFWD      0.274      0.001      0.879 
ACQU          0.106      0.078      0.151 
DEBTA         0.556      0.571      0.213 
MKTVAL        899.375    62.061     4129.631 

Panel B: Pearson Product-Moment Correlations between Firm Characteristics 

              DEFTAX     POSTRET    CARRYFWD   ACQU       DEBTA      MKTVAL 
DEFTAX        1.000 
POSTRET       0.231*     1.000 
CARRYFWD      -0.184>    -0.150*    1.000 
ACQU          -0.004     -0.085     0.465*     1.000 
DEBTA         -0.008     0.125*     0.042      0.082      1.000 
MKTVAL        0.173°     0.26%      -0.065     -0.056     0.016      1.000 

* DEFTAX   = deferred tax liability on the balance sheet over total assets, measured as of 1986. 
  POSTRET  = 0 if the firm offered no postretirement benefits in 1991, and a value of 1 otherwise. 
  CARRYFWD = tax loss carryforward benefits over total assets, averaged over 1986-1991. 
  ACQU     = acquisitions over total assets, averaged over 1986-1991, if the purchase method of acquisition is used; 0 
             otherwise. 
  DEBTA    = book value of total liabilities over book value of total assets, averaged over 1986-1991. 
  MKTVAL   = product of the year-end stock price and shares outstanding, averaged over 1986-1991. MKTVAL is 
             measured in millions. 
Significant at the .01 level based on a two-tailed test. 
° Significant at the .05 level based on a two-tailed test. 


and net worth increase and liabilities decrease. As such, investors are expected to decrease their 
estimates of debt contracting costs, resulting in a positive equity price reaction to these events. 

The formal vote by the FASB is considered a significant event as it should alter stockholders’ 
expectations that SFAS No. 96 will be revised to soften its impact on financial statements. 
Investors are, therefore, expected to revise upward their probability estimates that debt contract- 
ing costs will be reduced, resulting in an increase in stock prices. Panel A of table 2 briefly 


10 Footnote 3 explains why some firms may not be favorably affected by the new standards. 
TABLE 2 
Important Events Relating to Accounting for Income Taxes and Test of H1 

Panel A: Important Events 

Event   Announcement Date    Description 
1       September 3, 1986    The Board issued an Exposure Draft on income tax reporting that would 
                             cause most U.S. firms to report a large one-time increase in profits (even 
                             though it requires accrual of a tax liability for unrepatriated foreign 
                             earnings). 
2       October 2, 1990      The Board voted 5-2 to explore a proposal to revise SFAS No. 96. 
3       June 6, 1991         The Board issued an Exposure Draft of a proposed statement that increases 
                             corporate profits to supersede SFAS No. 96. 

Panel B: Test of H1 

Summary statistics on abnormal returns around each of the three events. Estimated mean values are dummy 
variable coefficients ($\gamma_e$) in Equation (1): 

$$ R_{pt} = \alpha + \beta R_{mt} + \sum_{e=1}^{E} \gamma_e D_{et} + \epsilon_t $$ 

Each dummy variable ($D_{et}$) takes a value of 1 during the three-day event period (the announcement date and 
the two surrounding trading days) and zero otherwise. Other statistics on abnormal returns (minimum, 
median, and maximum) are based on a set of seemingly unrelated regressions. 

            Portfolio Abnormal Return (in %)    Descriptive Statistics on Individual Security 
                                                Abnormal Returns (in %) 
Event^a                    T-statistic 
(e)         Mean           on the Mean^b        Minimum     Median     Maximum 
1           1.25           1.76^d               -1.82       1.46       3.12 
2           1.31           1.93^d               -1.01       1.12       2.16 
3           2.14           3.19^c               -2.43       2.51       4.19 

a Events are described in Panel A. 
b These statistics account for the cross-sectional and time-series heteroscedasticity, and the contemporaneous correlation 
  of residuals. 
c Significant at the .01 level based on a one-tailed t-test. 
d Significant at the .05 level based on a one-tailed t-test. 

describes the three events studied. Each announcement date listed is the date of a Wall Street 
Journal or New York Times article. 

The significance of the formal vote to revise SFAS No. 96 relative to the two Exposure Drafts 
cannot be assessed a priori. Between the two Exposure Drafts, however, the one leading to SFAS 
No. 109 (event 3) is expected to result in a more significant portfolio abnormal return as SFAS 


No. 109 allows recognition of deferred tax assets by anticipating future income.11 Net operating 
loss carryforwards were not normally recognized as assets under APB No. 11; under SFAS No. 
96, a benefit is recognized only when it can be applied to reduce deferred tax liabilities or taxes 
payable. SFAS No. 109 allows recognition of deferred tax assets for tax loss carryforwards and 
accrued postretirement benefits. 


Methodology 


The Multivariate Regression Model (MVRM) proposed by Schipper and Thompson (1983) 
is used to examine the average impact of each of the three events on stock prices of all firms (i.e., 
to test H1). By weighting the portfolio returns based on the estimated full covariance matrix of 
residuals, the MVRM accounts for both the cross-sectional heteroscedasticity and contempora- 
neous correlation of residuals. To correct for possible time-series heteroscedasticity, a procedure 
developed by White (1980) is employed in which the variance-covariance matrix of residuals is 
allowed to vary across observations (see Smith et al. 1986). 

The MVRM adds a zero-one dummy variable to the market model for each event, thus 
conditioning the return generating process on the occurrence or non-occurrence of that event. The 
coefficient of a dummy variable measures the impact on stock returns of the corresponding event. 
Because the exact timing of the information release is unknown, a three-day event period (i.e., the 
announcement date and the two surrounding trading days) is used. Each dummy variable takes 
a value of one when the corresponding event occurred and zero otherwise. The model is a portfolio 
equation: 


$$ R_{pt} = \alpha + \beta R_{mt} + \sum_{e=1}^{E} \gamma_e D_{et} + \epsilon_t \qquad (1) $$ 

where 
$R_{pt}$ = weighted portfolio return on day t (t = 1, 2, ..., T). T is the number of daily return 
    observations from 1985 through 1992. To increase the efficiency of parameter estimates, 
    returns are weighted based on the estimated full covariance matrix of residuals; 
$R_{mt}$ = return on the Standard and Poor's 500 Index on day t; 
$\alpha$ = intercept coefficient for the portfolio; 
$\beta$ = risk coefficient for the portfolio; 
$\gamma_e$ = the impact on the portfolio return of event e (e = 1, 2, ..., E). E is the number of events 
    considered (three in this study); 
$D_{et}$ = dummy variables that take a value of one during the three-day period (the announcement 
    date and the two surrounding trading days) of event e and zero otherwise; and, 
$\epsilon_t$ = random disturbance, assumed to be normally distributed and independent of the return 
    on the market and the event announcement variables. 
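
A minimal sketch of the event-dummy market model in Equation (1) follows. It is not the MVRM itself: it uses plain OLS on a single synthetic portfolio return series, with no covariance weighting and no White (1980) correction. The data, the parameter values, and the helper names (`solve`, `ols`, `event_dummy`) are all hypothetical, and the returns are generated without noise so that the estimates can be checked against the known coefficients.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


def event_dummy(n_days, event_day):
    """1 on the announcement day and the two surrounding trading days, else 0."""
    return [1.0 if abs(t - event_day) <= 1 else 0.0 for t in range(n_days)]


# Synthetic inputs (hypothetical): 60 trading days, three events.
n_days = 60
market = [0.01 * math.sin(t) for t in range(n_days)]
dummies = [event_dummy(n_days, day) for day in (10, 30, 50)]
true_params = [0.0002, 1.1, 0.004, 0.005, 0.007]  # alpha, beta, gamma_1..gamma_3

portfolio = [
    true_params[0] + true_params[1] * market[t]
    + sum(g * d[t] for g, d in zip(true_params[2:], dummies))
    for t in range(n_days)
]

x_rows = [[1.0, market[t]] + [d[t] for d in dummies] for t in range(n_days)]
estimates = ols(x_rows, portfolio)
print([round(v, 6) for v in estimates])
```

Because each dummy is switched on only inside its own three-day window, its coefficient picks up the average return in that window that the market model does not explain.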
Tests of H1 allow inferences about the impact on security prices of various events. To test 
H2 through H7 (i.e., to test the effect of firm characteristics on stock market reaction to these 
events), we use the portfolio weighting procedure advocated by Sefcik and Thompson (1986). By 


11 No future income could be anticipated under SFAS No. 96. A net deferred tax asset is recognized, under SFAS No. 96, 
only when it would be recoverable through a carryback refund of taxes paid in the current or prior years. (Under APB 
Opinion No. 11, deferred tax charges [assets] could be recorded or increased subject to a recoverability test similar to 
other assets, as long as taxes payable exceeded tax expense.) The impact on stock prices of the Exposure Draft leading 
to SFAS No. 96 is also expected to be diminished, as it was proposing to require companies to recognize a deferred tax 
liability for unrepatriated foreign earnings (an issue that was excepted in SFAS No. 96). 
accounting for potential collinearities among firm characteristics, this weighting procedure 
provides an opportunity to evaluate the relative importance of different firm characteristics in 
explaining the market reaction to pronouncements related to accounting for income taxes. 
The procedure involves the following steps. First, we define a matrix C having a column of 
1’s and (K-1) columns of firm characteristics, namely a dummy variable designating existence 
of postretirement benefits (POSTRET), net operating loss carryforwards over total assets 
(CARRYFWD), deferred tax liability over total assets (DEFTAX), acquisitions over total assets 
for purchase business combinations (ACQU), the debt ratio (DEBTA), and market value of the 
firm (MKTVAL): 
$$ C = [\mathbf{1} \;\; X_1 \;\; \cdots \;\; X_{K-1}], \qquad (2) $$ 

where $X_k$ is an N×1 vector of the kth firm characteristic (K = 7 and N = 420, the number of firms 
in the sample). 
Next, we develop K = 7 sets of portfolio weights: 

$$ W = \begin{bmatrix} W_1' \\ W_2' \\ \vdots \\ W_K' \end{bmatrix} = (C'C)^{-1} C'. \qquad (3) $$ 

Portfolio returns ($R_{kt}$) for each set are then computed as follows: 

$$ R_{kt} = W_k' R_t, \quad k = 1, 2, \ldots, K; \;\; t = 1, 2, \ldots, T \qquad (4) $$ 

where: 

$W$ = K×N matrix of portfolio weights (K = 7 and N = 420 firms); 

$W_k'$ = kth row of portfolio weights that are only influenced by the kth firm characteristic (by 
    a single column of C); 

$C$ = N×K matrix defined by Equation (2); 

$R_{kt}$ = weighted return on portfolio k on day t; and, 

$R_t$ = N×1 vector of individual firms’ security returns on day t. 


Finally, we run K = 7 OLS portfolio time-series regressions of the form: 

$$ R_{kt} = \alpha_k + \beta_k R_{mt} + \sum_{e=1}^{E} \gamma_{ke} D_{et} + \epsilon_{kt}. \qquad (5) $$ 


The event parameter estimates (or dummy variable coefficients, $\gamma_{ke}$) in each portfolio regression 
measure the effect of only one firm characteristic on stock market reaction to each of the three 
events. These estimates are the same as those in OLS cross-sectional regression of abnormal 
returns on firm characteristics (or regression of dummy variable coefficients in Equation (1) on 
firm characteristics). However, “unlike cross-sectional regressions, the standard errors of these 
estimates account fully for the cross-correlation and (cross-sectional) heteroscedasticity in firm 
disturbances ... No check for IID disturbances is necessary" (Sefcik and Thompson 1986, 324). 
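
The portfolio weighting step can be sketched on a toy example. The code below builds W = (C'C)^{-1} C' for five hypothetical firms and two hypothetical characteristics (so K = 3 rather than 7), checks that W C is the identity matrix, and shows that applying W to a vector of firm-level event effects returns the same coefficients as a cross-sectional OLS regression on C. The numbers and the helper names (`solve`, `portfolio_weights`) are illustrative assumptions; the sketch does not reproduce the time-series regressions in Equation (5) or their standard errors.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def portfolio_weights(c):
    """Return W = (C'C)^{-1} C' as a list of K rows, each of length N."""
    n, k = len(c), len(c[0])
    ctc = [[sum(c[i][p] * c[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    # Column i of W solves (C'C) w = (column i of C'), i.e. row i of C.
    cols = [solve(ctc, c[i]) for i in range(n)]
    return [[cols[i][p] for i in range(n)] for p in range(k)]


# Hypothetical C: a column of ones plus two firm characteristics.
c = [
    [1.0, 0.1, 0.5],
    [1.0, 0.4, 0.2],
    [1.0, 0.3, 0.9],
    [1.0, 0.8, 0.4],
    [1.0, 0.6, 0.7],
]
w = portfolio_weights(c)

# W C should be the K x K identity: portfolio k has unit exposure to
# characteristic k and zero exposure to the others.
wc = [[sum(w[p][i] * c[i][q] for i in range(len(c))) for q in range(3)]
      for p in range(3)]

# Hypothetical firm-level event effects that are exactly linear in C.
true_coefs = [0.01, 0.5, -0.2]
firm_effects = [sum(b * x for b, x in zip(true_coefs, row)) for row in c]

# Weighted "portfolio" effects equal the cross-sectional OLS coefficients.
portfolio_effects = [sum(w[p][i] * firm_effects[i] for i in range(len(c)))
                     for p in range(3)]
print([round(v, 6) for v in portfolio_effects])
```

The identity W C = I is what lets each portfolio isolate one characteristic: the kth weighted return loads on the kth column of C and is orthogonal to the rest.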


IV. RESULTS 


Panel B of table 2 reports the summary statistics on the portfolio abnormal returns for the 
three-day period (the announcement date and the two surrounding trading days) around each of 
the three events. Estimated mean values are the dummy variable coefficients in Equation (1). 
Event 3, the issuance of the Exposure Draft leading to SFAS No. 109, is the most significant one. 
For this event, the mean (average) three-day abnormal return is 2.14%. The t-statistic is 3.19, 
significantly different from zero at the .01 level. Events 2 and 1, with average abnormal returns 
of 1.31% and 1.25%, are also significant at the .05 level. All three of these abnormal returns have 
the predicted signs and are consistent with H1, indicating that stock prices have reacted favorably 
to announcements that the provisions of SFAS No. 96 and particularly SFAS No. 109 will be 
enacted.? The positive abnormal returns imply that, on average, the positive impact of debt 
contracting costs does outweigh the negative effects of increased political costs and compensation 
contracts.14 

The results are consistent with expectations and prior research. Significant abnormal returns 
are observed around both Exposure Drafts and the formal decision (vote) to revise SFAS No. 96 
and soften its impact on financial statements. Compared to the Exposure Draft leading to SFAS 
No. 109, the stock price reaction to the Exposure Draft leading to SFAS No. 96 was less 
pronounced. This result was expected, since the latter Exposure Draft proposed recognition of a 
deferred tax liability for the unrepatriated earnings of foreign subsidiaries and did not allow 
recognition of a deferred tax asset in anticipation of future income. 

Table 3 presents the results for H2 through H7. Each column in that table reports the event 
parameter estimates (or dummy variable coefficients, $\gamma_{ke}$) in one of the seven portfolio regressions 
described by Equation (5). The event parameter estimates in each regression measure the effect 
of the corresponding firm characteristic on stock market reaction to each of the three events. 

For example, the second column of table 3 presents the estimated coefficients of dummy 
variables in Equation (5) for the deferred tax liability over total assets (DEFTAX) portfolio. For 
event 3, the coefficient is positive and statistically significant at the .01 level as indicated by the 
t-statistic of 3.76. The coefficient is also significant for event 1, Exposure Draft leading to SFAS 
No. 96, but at the .05 level. For these events, the results are consistent with H2 that the stock price 


12 To ensure that the significant portfolio abnormal returns reported in panel B of table 2 are not driven by outliers, a set 
of seemingly unrelated regressions is run. Descriptive statistics on individual security abnormal returns, also reported 
in panel B of table 2, are examined around each of the three events. The median abnormal returns are all positive, 
suggesting that the majority of firms had a positive abnormal return around each of the three events. To be exact, 328, 
257, and 242 of the 420 abnormal returns are positive around events 3, 2, and 1, respectively. These numbers are 
significant at the .05 level using the binomial test, suggesting that results are not driven by outliers. 
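
As a rough check on the counts in this footnote, the binomial tail probability can be computed directly. The sketch assumes a one-sided test of the null hypothesis that positive and negative abnormal returns are equally likely (p = 0.5); the footnote does not state which form of the binomial test was used, and the function name `binomial_upper_tail` is illustrative.

```python
from math import comb


def binomial_upper_tail(successes, n, p=0.5):
    """P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


# Number of positive abnormal returns out of 420 firms, by event.
positive_counts = {3: 328, 2: 257, 1: 242}
p_values = {
    event: binomial_upper_tail(count, 420)
    for event, count in positive_counts.items()
}
for event, p_value in sorted(p_values.items()):
    print(f"event {event}: one-sided p-value = {p_value:.3g}")
```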

13 To further explore the impact of these events on the average and individual security prices, two sub-hypotheses related 
to the first hypothesis are tested following Schipper and Thompson (1983): (1A) Average (across all firms) impact on 
stock prices of each event is zero; and (1B) The impact of each event on stock prices of every firm is zero. Both sub- 
hypotheses are rejected for events 1, 2, and 3 at the .01 or .05 significance level based on either the full or diagonal 
covariance matrix specification. 

When all the 27 events relating to accounting for income taxes are considered, neither of the sub-hypotheses could 
be rejected for the other 24 events. That the second sub-hypothesis (1B) could not be rejected for the other 24 events 
implies that no firm had a significant price reaction to these events. If abnormal returns are not significantly different 
from zero for any firm, it would be very unlikely to find a significant relation between abnormal returns and firm 
characteristics. Test of H2 through H7 considering all the 27 events confirmed this belief. 

14 To highlight the differences in abnormal returns across industries, a system of five equations (each similar to Equation 
(1)) is estimated using five industry portfolios. The average portfolio abnormal returns and the t-statistics across 
industries indicate that stock prices of manufacturing firms are most favorably affected by these three events. At the .05 
significance level, agriculture, mining, and construction firms do not exhibit a significant abnormal return around event 
1 while transportation, communication, and hazardous waste management firms have no significant abnormal return 
around event 2 (these t-statistics are significant at the .1 level). There is no significant equity price reaction to any event 
for service or wholesale and retail firms. 

When all the 27 events are considered, there is no significant equity price reaction to the other 24 events in any 
industry at the .05 level of significance. 
reaction to the proposed standards on accounting for income taxes is positively related to deferred 
tax liability over total assets. Similarly, results reported in columns 3 through 7 are consistent with 
H3 through H7. Specifically, results indicate that stock price response is positively related to the 
existence of postretirement benefits (POSTRET), net operating loss carryforwards over total 
assets (CARRYFWD), and debt ratio (DEBTA); and negatively related to acquisitions over total 
assets (ACQU), and market value of the firm (MKTVAL). 

Overall, the results for the six portfolios show significant portfolio abnormal returns and 
cross-sectional relations between the security returns and the firm characteristics on the two 
Exposure Draft announcement dates at the .05 level. While a significant portfolio abnormal return 
is observed around the formal decision to revise SFAS No. 96 (event 2), the abnormal return is 
only associated with tax loss carryforward benefits over total assets. 

The relations are strongest for the Exposure Draft leading to SFAS No. 109 (event 3), as was 
the portfolio abnormal return. This result seems intuitive as SFAS No. 109 allowed recognition 
of deferred tax assets by anticipating future income. Net operating loss carryforwards were not 
normally recognized as assets under APB No. 11; under SFAS No. 96, a benefit is recognized only 
when it can be applied to reduce deferred tax liabilities or taxes payable. SFAS No. 109 broadens 
the conditions allowing recognition of deferred tax assets for tax loss carryforwards and accrued 
postretirement benefits. 


V. CONCLUSION 


Since the issuance of APB Opinion No. 11 in 1967, accounting for income taxes has been the 
subject of criticisms and concerns. The FASB issued SFAS No. 96 in 1987, the effective date of 
which was delayed three times and eventually superseded by SFAS No. 109 in 1992. These new 
standards will have a significant impact on firms' financial statements and, based on contracting 
and political cost hypotheses, should impact stock prices. 

This study documents empirical evidence of the relation between firm characteristics and the 
market reaction to two generally income-increasing standards. The study also highlights the 
offsetting effects on stock prices of various accounting rules. It shows that the income tax rule, 
by easing the earnings impact of the postretirement benefits rule, results in relatively higher 
positive abnormal returns for firms offering such benefits. 

The three-day abnormal returns around the two Exposure Drafts leading to SFAS No. 96 and 
No. 109, and the FASB’s decision to revise SFAS No. 96, are 1.25%, 2.14%, and 1.31%; these 
abnormal returns have the expected signs and are significantly different from zero at the .05, .01, 
and .05 levels respectively. The portfolio weighting procedure indicates that the equity price 
reaction to these standards is related to the magnitude of their income effects and the economic 
consequences of a given income effect. Specifically, we show that those most favorably affected 
by the new standards are small firms, firms offering postretirement benefits, firms with high debt 
ratios, firms with high deferred tax liability and loss carryforward benefits over total assets, and 
firms not using the purchase method of acquisition (or generally those with few or no acquisitions, 
as the pooling of interest method is fairly restricted). 

The significant stock price response in this study suggests that even if a standard does not 
increase or decrease the acceptable set of accounting procedures, its income effect alone has 


This study also documents that equity price reaction to a mandated accounting change varies across industries (see 
footnote 14). 
important economic consequences. In proposing future accounting standards, therefore, policy 
makers should bear in mind that equity price reaction to a standard is related to the magnitude of 
its income effects and the resulting economic consequences. 
LAWRENCE A. PONEMON and DAVID R. L. GABHART, Ethical Reasoning in Accounting 
and Auditing, (Vancouver: CGA-Canada Research Foundation, 1993, pp. xvi, 148, $14.00). 


Academicians, graduate students and practitioners interested in accounting ethics should read this 
latest monograph produced by the CGA-Canada Research Foundation for all the wrong reasons. By this I 
mean that the authors' stated purposes for producing this volume are not the real reasons for buying and 
reading it. Instead, Ethical Reasoning in Accounting and Auditing will occupy the shelves of most serious 
accounting ethics researchers for much the same reason that James Rest’s Moral Development does: it states 
succinctly the findings and implications of a broad range of relevant accounting ethics research. 

Ponemon and Gabhart’s stated purpose for the book is “to explore the influence of ethical reasoning 
upon domain-specific judgment and behavior” (p. vii). Specifically, the authors include the results of their 
own previously unpublished studies assessing (1) the impact of cross-national (Canadian vs. American) 
differences in the auditing professions on practitioners' ethical judgments and (2) the influence of ethical 
reasoning on a variety of audit judgments. Among other results, Ponemon and Gabhart find that Canadian 
practitioners are higher level and less homogeneous moral reasoners, and that higher level moral reasoners 
are more sensitive to the problems of auditing managers who are highly competent but low on integrity. 
While some of the results are interesting, they are not likely to represent the primary influence of this book. 

But I believe the book's influence will be substantial. The first half of the monograph will make the 
accounting ethics literature accessible to all for a small investment of time. The chapter on factors affecting 
ethical reasoning in accounting and auditing will likely stimulate research by Ph.D. students the way the 
Research Opportunities in Auditing volumes have. The book also provides a feel for how the research is 
linked to real-world behavior. As a result, the book may provide several weeks of discussion in appropriate 
graduate level accounting courses. 

Ponemon and Gabhart also use three cases to describe how people at Kohlberg's various stages of moral 
reasoning might respond. This process is helpful for the reader because it more clearly defines the 
implications of reasoning at the various levels. Perhaps the most eye-opening implication is the freedom 
post-conventional (or higher level) moral reasoners feel to exempt themselves from rules that conflict with 
their self-chosen ethical principles. 

Two weaknesses of the monograph are worth noting. First, the scope of the book is narrower than the 
title would indicate. The focus of the entire monograph is on ethics in the external auditor-client and 
auditor-auditor relationships. Also, while admitting that the book deals with the practitioner's ethical reasoning 
ability and not with personal integrity, the authors do not provide any real insight into this critical difference. 

Although the book is short enough to be included as a segment of a graduate course, those doing accounting 
ethics research will want to have it as a reference. Even those not interested in or focused on the 
Kohlbergian cognitive moral development paradigm should be aware of its influence on accounting ethics 
research. This paradigm, for better or worse, has framed discussion of ethics in the profession among 
academics for the last decade. Ponemon and Gabhart punctuate its influence with this monograph. 

MICHAEL K. SHAUB 


Associate Professor of Accounting 
Hillsdale College 




670 The Accounting Review, October 1995 


GLENN VAN WYHE, The Struggle for Status: A History of Accounting Education, First Edition, 
(New York and London: Garland Publishing, Inc., 1994, pp. vvi, 263). 


The author of this worthwhile book should be commended for his thorough research of numerous items 
from various professional and academic organizations addressing the history of accounting education. The 
author begins with a discussion of cultural capital and the purposes and goals of a profession with the ultimate 
goal of establishing “buying power.” He clearly indicates that the central issue of education policy surrounds 
the matter of curriculum; thus, the central issue of accounting education involves integrating the expanding 
body of relevant knowledge into the curriculum. Since it is impractical to include everything needed to 
prepare people to enter the profession in a professional accounting program, prioritizing subject matter 
becomes of prime importance. 

The author ably, in chapter one, reviews the beginning of collegiate accounting education which 
includes such academic subjects as mathematics, natural sciences and languages. It is clear that even from 
the beginning of accounting education, curriculum builders wanted to provide a high quality liberal arts 
education that focused on subject matter that would be "relevant" to someone entering a profession. That 
focus may very well have originated with our humble beginnings, Pacioli, who some say established an 
intellectual basis for accounting. Even in the beginning of collegiate education, focusing subject matter from 
a theoretical to a practical viewpoint was a major thrust. Professor Hatfield’s elementary accounting course 
concentrated on the interpretation of information from the view of the business manager rather than from 
the view of the professional bookkeeper or accountant. Thus, some people feel that it is clear that substantial 
emphasis was placed on a liberal education at the very beginning of the education of professional 
accountants. 

At the turn of the century and the passing of the first CPA law in New York in 1896, an additional 
question arose regarding accounting education and its focus. Many accounting practitioners of that era were 
strong supporters of the conceptual approach to accounting education. The question quickly arose as to the 
role of accounting education in a university and the role of accounting experience in providing an adequate 
education for the professional accountant. 

The historical record also indicates that the idea of a fifth year of college education for CPAs first 
surfaced at Columbia University, based on a survey by Professor Ralph Kester. He envisaged a program with 
two years of liberal arts and three years of technical training. The model was patterned after that in schools 
of law. Four decades later, a prominent Ivy League professor, John C. "Sandy" Burton, also supported adding 
a fifth year of study to professional programs. Whenever accounting education is discussed and curriculum 
changes are considered, the old question arises of what learning experience the student should get in a 
practical experience environment. Nine decades after Dr. Hatfield's course, we continue to have substantial 
discussions regarding how accounting should be taught. 

The author of this research-based book, Professor Van Wyhe, reviews various academic and professional 
studies conducted under the auspices of organizations interested in accounting education. He provides 
a thorough analysis of the introduction of management accounting during the post-war years, indicating the 
need for accounting executives in business enterprises to be a legitimate part of the management team. To 
adequately prepare accountants for such roles, the question again arises whether accounting education 
should take a broad view recognizing academic legitimacy and intellectual content coupled with the ability 
to integrate the subject matter into a business environment. 

The author recognizes the importance of accounting as a communication process through which useful 
information is provided to various user groups. The public accounting arm of the profession brought forth 
its role as a "business consultant." The discussion of the quality of accounting education is a thread that can 
be found in each of the decades covered by this book. One concern that has persisted is 
whether or not graduates of accounting programs have an adequate understanding of written and spoken 
English. 


Accounting education studies have consistently indicated that the curriculum should have clear, 
professional academic content and that both the accounting and liberal arts components should be 
intellectually challenging. The real differences of opinion between practitioners and educators, and among 
educators themselves, have concerned teaching methodology, application to subject matter, intellectual 
content, conceptual versus practical approaches and the quality and depth of a liberal education. The general 
consensus has been that the liberal arts education should provide the student with the ability to communicate 
effectively while enabling him or her to function in a free market economy with an understanding of the 
social, economic and political influences in various parts of the world. The focus should not be on the 
elementary accounting course but on accounting as it relates to and is driven by business. 

Various foundation and accounting organization reports issued in the 1950s and 1960s had significant 
influence on accounting education. Some of these studies focused on business education, but influenced 
the accounting curriculum. Some of the reports were industry-specific and related to the content of 
accounting programs. Again, the old questions surfaced concerning fifth-year accounting programs and how 
organization structures should be designed to administer them. 

The curriculum content continues to focus on the analytical tools of economics, quantitative and 
statistical tools, mathematics, computers and systems, coupled with an understanding of the national and 
international marketplace. Various study groups have developed broad-based doctrines relating to the 
delivery system of these academic requirements. The lack of independence in accounting education has 
prevented educators from integrating all of these elements into a modern, forward-looking educational 
program. The challenge we face today is whether or not we have the courage, the intellectual curiosity and 
determination to move beyond the elementary accounting course and develop a legitimate, quality, 
integrated university accounting program for the future professional. 

It is important for those of us evaluating curriculum content in academic programs to read this important 
book on the history of accounting education. A review of our history can be a significant learning experience. 

JAMES DON EDWARDS 
J. M. Tull Professor of Accounting 
University of Georgia 
TABLES AND FIGURES 


The author should note the following general requirements: 

1. Each table and figure (graphic) should appear on a separate page and should be placed at the end of the 
text. Each should bear an Arabic number and a complete title indicating the exact contents of the table 
or figure. 

2. A reference to each graphic should be made in the text. 

3. The author should indicate by marginal notation where each graphic should be inserted in the text. 

4. Graphics should be reasonably interpreted without reference to the text. 

5. Source lines and notes should be included as necessary. 


Equations: Equations should be numbered in parentheses flush with the right-hand margin. 
DOCUMENTATION 


Citations: Work cited should use the “author-date system” keyed to a list of works in the reference list 
(see below). Authors should make an effort to include the relevant page numbers in the cited works. 


1. In the text, works are cited as follows: authors’ last name and date, without comma, in parentheses: for 
example, (Jones 1987); with two authors: (Jones and Freeman 1973); with more than two: (Jones et al. 
1985); with more than one source cited together (Jones 1987; Freeman 1986); with two or more works 
by one author: (Jones 1985, 1987). 

2. Unless confusion would result, do not use “p.” or “pp.” before page numbers: for example, (Jones 1987, 
115). 

3. When the reference list contains more than one work of an author published in the same year, the suffix 
a, b, etc. follows the date in the text citation: for example, (Jones 1987a) or (Jones 1987a; Freeman 
1985b). 

4. If an author’s name is mentioned in the text, it need not be repeated in the citation; for example, “Jones 
(1987, 115) says...." 

5. Citations to institutional works should use acronyms or short titles where practicable; for example, 
(AAA ASOBAT 1966); (AICPA Cohen Commission Report 1977). Where brief, the full title of an 
institutional work might be shown in a citation: for example, (ICAEW The Corporate Report 1975). 

6. If the manuscript refers to statutes, legal treatises or court cases, citations acceptable in law reviews 
should be used. 


Reference List: Every manuscript must include a list of references containing only those works cited. Each 
entry should contain all data necessary for unambiguous identification. With the author-date system, use the 
following format recommended by the Chicago Manual: 


1. Arrange citations in alphabetical order according to surname of the first author or the name of the 
institution responsible for the citation. 

2. Use author’s initials instead of proper names. 

3. Dates of publication should be placed immediately after author's name. 

4. Titles of journals should not be abbreviated. 

5. Multiple works by the same author(s) should be listed in chronological order of publication. Two or more 
works by the same author(s) in the same year are distinguished by letters after the date. 

6. Inclusive page numbers are treated as recommended in Chicago Manual section 8.67. 
American Accounting Association, Committee on Concepts and Standards for External Financial Reports. 
1977. Statement on Accounting Theory and Theory Acceptance. Sarasota, FL: AAA. 
of Accounting Research 27 (Spring): 40-58. 
Fabozzi, F., and I. Pollack, eds. 1987. The Handbook of Fixed Income Securities. 2d ed. Homewood, IL: Dow 
Cambridge, United Kingdom: Cambridge University Press. 

Porcano, T. M. 1984a. Distributive justice and tax policy. The Accounting Review 59 (October): 619-36. 
Footnotes: Footnotes are not used for documentation. Textual footnotes should be used only for extensions 
and useful excursions of information that if included in the body of the text might disrupt its continuity. 
Footnotes should be double spaced and numbered consecutively throughout the manuscript with superscript 
Arabic numerals. Footnotes are placed at the end of the text. 


SUBMISSION OF MANUSCRIPTS 
Authors should note the following guidelines for submitting manuscripts: 


1. Manuscripts currently under consideration by another journal or publisher should not be submitted. The 
author must state that the work is not submitted or published elsewhere. 

2. In the case of manuscripts reporting on field surveys or experiments, four copies of the instrument 
(questionnaire, case, interview plan or the like) should be submitted. 

3. Four copies should be submitted together with a check in U.S. funds for $50.00 for members or $100.00 
for nonmembers of the AAA made payable to the American Accounting Association. Effective January 
1990, the submission fee is nonrefundable. 

4. The author should retain a copy of the paper. 

5. Revisions must be submitted within 12 months from request; otherwise they will be considered new 
submissions. 


COMMENTS 


Comments on articles previously published in The Accounting Review will be reviewed (anonymously) 
by two reviewers in sequence. The first reviewer will be the author of the original article being subjected 
to critique. If substance permits, a suitably revised comment will be sent to a second reviewer to determine 
its publishability in The Accounting Review. If a comment is accepted for publication, the original author 
will be invited to reply. All other editorial requirements, as enumerated above, also apply to proposed 
comments. 
POLICY ON REPRODUCTION 


An objective of The Accounting Review is to promote the wide dissemination of the results of systematic 
scholarly inquiries into the broad field of accounting. 
Permission is hereby granted to reproduce any of the contents of the Review for use in courses of 
instruction, as long as the source and American Accounting Association copyright are indicated in any such 
reproductions. 

Written application must be made to the Editor for permission to reproduce any of the contents of the 
Review for use in other than courses of instruction—e.g., inclusion in books of readings or in any other 
publications intended for general distribution. In consideration for the grant of permission by the Review 
Except where otherwise noted in articles, the copyright interest has been transferred to the American 
applicants must seek permission to reproduce (for all purposes) directly from the author(s). 


POLICY ON DATA AVAILABILITY 
to provide the widest possible dissemination of knowledge based on systematic scholarly inquiries into 
accounting as a field of professional, research, and educational activity. As part of this process, authors are 
encouraged to make their data available for use by others in extending or replicating results reported in their 
articles. Authors of articles which report data-dependent results should footnote the status of data 
availability and, when pertinent, this should be accompanied by information on how the data may be 
obtained. 